merge

SciDB merge, cross_join, and join operations.

## S3 method for class 'scidb':
merge(x,y, by=intersect(dimensions(x),dimensions(y)), by.x, by.y, merge, eval)
## S3 method for class 'scidbdf':
merge(x,y, by=intersect(dimensions(x),dimensions(y)), by.x, by.y, merge, eval)

x, y: scidb or scidbdf objects.
by.x: dimension or attribute names of x to join on. See details.
by.y: dimension or attribute names of y to join on. See details.

The result is a scidb or scidbdf reference object.

Either by or both by.x and by.y may be specified. If neither by.x nor by.y
is specified and by=NULL, the result is the Cartesian product of x and y.
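
The Cartesian-product case mirrors how base R's merge treats data frames when by is empty. The sketch below uses ordinary data frames with made-up values, not SciDB arrays, so it only illustrates the shape of the result and says nothing about how the scidb method computes it.

```r
# Two small data frames standing in for arrays x and y (illustrative data only).
x <- data.frame(i = 1:2, a = c("p", "q"))
y <- data.frame(j = 1:3, b = c(10, 20, 30))

# With by = NULL, base R's merge() pairs every row of x with every row of y.
cp <- merge(x, y, by = NULL)
nrow(cp)  # 6 rows: 2 rows of x times 3 rows of y
```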

The default value of by performs a cross_join or join along common array
dimensions.

If only by is specified, the dimension or attribute names in by are assumed
to be common to x and y. Otherwise, dimension or attribute names are matched
across the names listed in by.x and by.y, respectively.

If dimension names are specified and by contains all the dimensions in each
array, then the SciDB join operator is used; otherwise SciDB's cross_join
operator is used. In either case, the output is a cross product set of the
two arrays along the specified dimensions.
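
The operator-selection rule above can be sketched as a small pure-R function. The helper name choose_join_operator and its arguments are hypothetical and are not part of the scidb package. The sketch covers only the case where a single by vector is given, and it takes the dimension names as plain character vectors.

```r
# Hypothetical helper (not a scidb package function): report which SciDB
# operator the rule described above would select for a given `by`.
# dims_x and dims_y are character vectors of each array's dimension names.
choose_join_operator <- function(by, dims_x, dims_y) {
  # "by contains all the dimensions in each array"
  covers_all <- all(dims_x %in% by) && all(dims_y %in% by)
  if (covers_all) "join" else "cross_join"
}

# Both arrays are 1-d along "row", so `by` covers every dimension.
op1 <- choose_join_operator("row", dims_x = "row", dims_y = "row")

# x has an extra dimension "j" that `by` does not list.
op2 <- choose_join_operator("i", dims_x = c("i", "j"), dims_y = "i")
```

Here op1 is "join" and op2 is "cross_join".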

If by, or each of by.x and by.y, lists a single attribute name, the indicated
attributes are lexicographically ordered as categorical variables, SciDB
redimensions each array along new coordinate systems defined by the
attributes, and the redimensioned arrays are then joined. This method limits
joins along attributes to a single attribute from each array. The output
array contains additional columns showing the attribute factor levels used
to join the arrays.
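
To make the idea of ordering attributes as categorical variables concrete, here is a base-R sketch that maps attribute values to integer positions in a lexicographically sorted level set. It runs locally on made-up vectors. Two things are assumptions made for illustration and are not confirmed by the package documentation: that one shared level set is used for both arrays, and that positions start at one.

```r
# Illustrative attribute values from two arrays (made-up data).
attr_x <- c("setosa", "virginica", "setosa")
attr_y <- c("versicolor", "setosa")

# Lexicographically ordered levels covering both attributes.
# method = "radix" sorts in the C locale, so the order is locale-independent.
lev <- sort(unique(c(attr_x, attr_y)), method = "radix")

# Integer position of each value along the new, level-defined coordinate.
coord_x <- match(attr_x, lev)
coord_y <- match(attr_y, lev)
```

Cells from the two arrays that share a position (here "setosa" at position 1) are the ones a join along the new dimension would pair up.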

This method is limited to SQL-like "natural joins", a special case of inner
joins corresponding to the all=FALSE case in the standard R merge function.
A future version of this package will include additional join cases.
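
For comparison, the all=FALSE behavior of the standard R merge function looks like this on ordinary data frames. The data are made up, and the example shows only base R behavior, not the scidb method itself.

```r
# Two data frames sharing a "row" key; only rows 3 and 4 appear in both.
x <- data.frame(row = 1:4, a = c(5.1, 4.9, 4.7, 4.6))
y <- data.frame(row = 3:6, b = c("u", "v", "w", "z"))

# all = FALSE (the default) keeps only keys present in both inputs.
inner <- merge(x, y, by = "row", all = FALSE)
inner$row  # 3 4
```

Rows whose key appears in only one input are dropped, which is the inner-join behavior the paragraph above describes.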

Specify merge=TRUE to perform a SciDB merge operation instead of a SciDB
join.

The various SciDB join operators generally require that the arrays have
identical partitioning (coordinate system bounds, chunk size, etc.) in the
common dimensions. The merge method attempts to rectify SciDB arrays along
the specified dimensions as required before joining.
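
# Assumes the scidb package is installed and a SciDB connection is already
# open (for example via scidbconnect()).
library(scidb)

# Create a copy of the iris data frame in a 1-d SciDB array named "iris".
# Note that SciDB attribute names will be changed to conform to SciDB
# naming conventions.
x <- as.scidb(iris, name="iris")
a <- x$Species
b <- x$Petal_Length

# Join the two single-attribute arrays along their common "row" dimension.
c <- merge(a, b, by="row")

# Perform a SciDB merge instead of a join.
merge(b, b, by="row", merge=TRUE)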