join

0th

Percentile

Join two data frames together.

Join, like merge, is designed for the types of problems where you would use a sql join.

Keywords
manip
Usage
join(x, y, by = NULL, type = "left", match = "all")
Arguments
x
data frame
y
data frame
by
character vector of variable names to join by. If omitted, will match on all common variables.
type
type of join: left (default), right, inner or full. See details for more information.
match
how should duplicate ids be matched? Either match just the "first" matching row, or match "all" matching rows. Defaults to "all" for compatibility with merge, but "first" is significantly faster.
Details

The four join types return:

  • inner: only rows with matching keys in both x and y
  • left: all rows in x, adding matching columns from y
  • right: all rows in y, adding matching columns from x
  • full: all rows in x with matching columns in y, then the rows of y that don't match x.

Note that from plyr 1.5, join will (by default) return all matches, not just the first match, as it did previously.

Unlike merge, preserves the order of x no matter what join type is used. If needed, rows from y will be added to the bottom. Join is often faster than merge, although it is somewhat less featureful - it currently offers no way to rename output or merge on different variables in the x and y data frames.

Aliases
  • join
Examples
library(plyr) first <- ddply(baseball, "id", summarise, first = min(year)) system.time(b2 <- merge(baseball, first, by = "id", all.x = TRUE)) system.time(b3 <- join(baseball, first, by = "id")) b2 <- arrange(b2, id, year, stint) b3 <- arrange(b3, id, year, stint) stopifnot(all.equal(b2, b3))
Documentation reproduced from package plyr, version 1.8.4, License: MIT + file LICENSE

Community examples

Looks like there are no examples yet.