join(x, y, by = intersect(names(x), names(y)), type = "left", match = "all")
match: either "first", to match only the first matching row, or "all", to match all matching rows.
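The effect of the match argument can be illustrated with a minimal sketch. It assumes plyr is installed; the toy data frames x and y and their column names are hypothetical, chosen so that one key occurs twice in y.

```r
library(plyr)

# Hypothetical toy data: key "a" appears twice in y.
x <- data.frame(k = c("a", "b"), xval = 1:2, stringsAsFactors = FALSE)
y <- data.frame(k = c("a", "a", "b"), yval = c(10, 11, 20),
                stringsAsFactors = FALSE)

# match = "all" (the default) keeps every pairing of an x row with a y row.
all_matches <- join(x, y, by = "k", match = "all")

# match = "first" keeps only the first matching y row for each x row.
first_match <- join(x, y, by = "k", match = "first")

nrow(all_matches)
nrow(first_match)
```

With match = "first" the result has exactly one row per row of x, which is useful when y is expected to act as a lookup table.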
inner: only rows with matching keys in both x and y
left: all rows in x, adding matching columns from y
right: all rows in y, adding matching columns from x
full: all rows in x with matching columns in y, then the rows of y that don't match x

Note that from plyr 1.5, join will (by default) return all matches, not just the first match, as it did previously.
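The four join types can be compared side by side in a small sketch. It assumes plyr is installed; the data frames and their columns are hypothetical and share only the keys "b" and "c".

```r
library(plyr)

# Hypothetical toy data: x has keys a, b, c; y has keys b, c, d.
x <- data.frame(k = c("a", "b", "c"), xval = 1:3, stringsAsFactors = FALSE)
y <- data.frame(k = c("b", "c", "d"), yval = c(20, 30, 40),
                stringsAsFactors = FALSE)

inner_j <- join(x, y, by = "k", type = "inner")  # keys present in both
left_j  <- join(x, y, by = "k", type = "left")   # every row of x
right_j <- join(x, y, by = "k", type = "right")  # every row of y
full_j  <- join(x, y, by = "k", type = "full")   # x rows, then unmatched y rows

# Row counts differ by join type; unmatched cells are filled with NA.
sapply(list(inner = inner_j, left = left_j, right = right_j, full = full_j),
       nrow)
```

In the full join, the unmatched row of y (key "d") is appended after the rows of x.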
Unlike merge, join preserves the order of x no matter what join type is used. If needed, rows from y will be added to the bottom. join is often faster than merge, although it is somewhat less featureful: it currently offers no way to rename output columns or to join on variables that have different names in the x and y data frames.
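A minimal sketch of both points follows, assuming plyr is installed. The data frames and the column name player_id are hypothetical. Because join needs the key to have the same name on both sides, one possible workaround is to rename the key in y with plyr's rename before joining.

```r
library(plyr)

# Hypothetical toy data: x is deliberately unsorted, and the key column
# has a different name in y.
x <- data.frame(id = c("c", "a", "b"), xval = 1:3, stringsAsFactors = FALSE)
y <- data.frame(player_id = c("a", "b", "c"), yval = c(10, 20, 30),
                stringsAsFactors = FALSE)

# Workaround (an assumption, not a feature of join): align the key names.
y2 <- rename(y, c(player_id = "id"))

joined <- join(x, y2, by = "id")   # keeps the row order of x
merged <- merge(x, y2, by = "id")  # sorts on the key by default

joined$id
merged$id
```

Here joined keeps the rows in the order they appear in x, while merged comes back sorted by id.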
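library(plyr)  # provides join, ddply, arrange and the baseball data set

# First year on record for each player
first <- ddply(baseball, "id", summarise, first = min(year))

# Time the same left join with merge and with join
system.time(b2 <- merge(baseball, first, by = "id", all.x = TRUE))
system.time(b3 <- join(baseball, first, by = "id"))

# Put both results in the same row order, then check they agree
b2 <- arrange(b2, id, year, stint)
b3 <- arrange(b3, id, year, stint)
stopifnot(all.equal(b2, b3))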