Merges any number of species matrices on their common columns to create a new data set with the number of columns equal to the number of unique columns across all data frames. Needed for analysis of fossil data sets with respect to training set samples.
join(..., verbose = FALSE, na.replace = TRUE, split = TRUE, value = 0,
     type = c("outer", "left", "inner"))

# S3 method for join
head(x, ...)

# S3 method for join
tail(x, ...)
logical; if TRUE, the function prints out the dimensions of the data frames in "...", as well as those of the returned, merged data frame.
logical; where a column in one data frame has no matching column in the other, the samples from the other data frame will contain missing values (NA) in that column. If na.replace is TRUE, these missing values are replaced with value (zero by default). Replacing with zeros is standard practice in ecology and palaeoecology. If you want to replace with another value, either supply it via value or set na.replace to FALSE and do the replacement later.
logical; should the samples of the merged data set be split back into individual data frames, but now with common columns (i.e. species)?
numeric; value to replace NA with if na.replace is TRUE.
character; type of join to perform. "outer" returns the union of the variables in the data frames to be merged, such that the resulting objects have columns for all variables found across all the data frames to be merged. "left" returns the left outer (or the left) join, such that the merged data frames contain the set of variables found in the first supplied data frame. "inner" returns the inner join, such that the merged data frames contain the intersection of the variables in the supplied data frames. See Details.
an object of class "join", usually the result of a call to join.
If split = TRUE, an object of class "join", a list of data frames, with as many components as the number of data frames originally merged.
Otherwise, an object of class c("join", "data.frame"), a data frame containing the merged data sets.
head.join and tail.join return a list, each component of which is the result of a call to head or tail on each data set component of the joined object.
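The behaviour described for head and tail on a split result can be pictured with base R alone. The sketch below is not the package's implementation: `merged` is a hypothetical plain list standing in for a "join" object, and the data values are invented for illustration.

```r
## Hypothetical stand-in for a split join result: a list of data frames
## that share the same columns (toy values, not from any package data set)
merged <- list(
  train  = data.frame(spA = 1:6, spB = 6:1),
  fossil = data.frame(spA = 0,   spB = 11:16)
)

## Apply head() / tail() to every component; the result is again a list
first_rows <- lapply(merged, head, n = 2)
last_rows  <- lapply(merged, tail, n = 2)

str(first_rows)
```

Each element of `first_rows` and `last_rows` keeps the name of the data set it came from, so the components can be inspected side by side.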
When merging multiple data frames the set of variables in the merged data can be determined via a number of routes. join provides three join types: the outer join, the left outer (or simply the left) join, and the inner join. Which type of join is performed is determined by the argument type.
The outer join returns the union of the set of variables found in the data frames to be merged. This means that the resulting data frame(s) contain columns for all the variables observed across all the data frames supplied for merging.
With the left outer join the resulting data frame(s) contain only the set of variables found in the first data frame provided.
The inner join returns the intersection of the set of variables found in the supplied data frames. The resulting data frame(s) contain the variables common to all supplied data frames.
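The three join types differ only in which set of column names is kept. The following base R sketch illustrates that idea on two toy data frames; it is an assumption-laden illustration, not the code used by join. The data, and the helper `conform()`, are hypothetical.

```r
## Two toy species matrices (hypothetical data)
train  <- data.frame(spA = c(10, 20), spB = c(5, 0), row.names = c("t1", "t2"))
fossil <- data.frame(spB = c(1, 2),   spC = c(7, 8), row.names = c("f1", "f2"))

## Column sets implied by each join type
outer_vars <- union(names(train), names(fossil))      # all variables
left_vars  <- names(train)                            # first data frame only
inner_vars <- intersect(names(train), names(fossil))  # shared variables

## Hypothetical helper: make a data frame conform to a target column set,
## filling variables it lacks with `value`
conform <- function(df, vars, value = 0) {
  for (v in setdiff(vars, names(df))) df[[v]] <- value
  df[, vars, drop = FALSE]
}

dfs <- list(train = train, fossil = fossil)
outer_res <- lapply(dfs, conform, vars = outer_vars)
left_res  <- lapply(dfs, conform, vars = left_vars)
inner_res <- lapply(dfs, conform, vars = inner_vars)

outer_res$fossil
```

In the outer and left results the fossil samples gain a zero-filled `spA` column, mirroring the default of replacing missing values with zero; the left and inner results drop `spC` because it is absent from the first data frame.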
# NOT RUN {
## load the example data
data(swapdiat, swappH, rlgh)
## merge training and test set on columns
dat <- join(swapdiat, rlgh, verbose = TRUE)
## extract the merged data sets and convert to proportions
swapdiat <- dat[[1]] / 100
rlgh <- dat[[2]] / 100
## merge training and test set using left join
head(join(swapdiat, rlgh, verbose = TRUE, type = "left"))
## load the example data
data(ImbrieKipp, SumSST, V12.122)
## merge training and test set on columns
dat <- join(ImbrieKipp, V12.122, verbose = TRUE)
## extract the merged data sets and convert to proportions
ImbrieKipp <- dat[[1]] / 100
V12.122 <- dat[[2]] / 100
## show just the first few lines of each data set
head(dat, n = 4)
## show just the last few lines of each data set
tail(dat, n = 4)
## merge training and test set using inner join
head(join(ImbrieKipp, V12.122, verbose = TRUE, type = "inner"))
## merge training and test set using outer join and replace
## NA with -99.9
head(join(ImbrieKipp, V12.122, verbose = TRUE, value = -99.9))
# }