dplyr (version 0.3)

join.tbl_df: Join data frame tbls.

Description

See join for a description of the general purpose of the functions.

Usage

# S3 method for tbl_df
inner_join(x, y, by = NULL, copy = FALSE, ...)

# S3 method for tbl_df
left_join(x, y, by = NULL, copy = FALSE, ...)

# S3 method for tbl_df
semi_join(x, y, by = NULL, copy = FALSE, ...)

# S3 method for tbl_df
anti_join(x, y, by = NULL, copy = FALSE, ...)

Arguments

x,y
tbls to join
by
a character vector of variables to join by. If NULL, the default, join will do a natural join, using all variables with common names across the two tables. A message lists the variables so that you can check they're right; to suppress the message, supply a character vector.
copy
If y is not a data frame or tbl_df and copy is TRUE, y will be converted into a data frame.
...
included for compatibility with the generic; otherwise ignored.
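
As a minimal sketch of how the by argument behaves, the following uses two small made-up tables (players and scores are hypothetical and not part of any package data). It assumes dplyr is installed; in recent dplyr releases tbl_df() may emit a deprecation warning.

```r
library(dplyr)

# Hypothetical toy tables for illustration only
players <- tbl_df(data.frame(
  id = c("a", "b", "c"),
  name = c("Ann", "Bob", "Cy"),
  stringsAsFactors = FALSE
))
scores <- tbl_df(data.frame(
  id = c("a", "a", "d"),
  runs = c(1, 2, 3),
  stringsAsFactors = FALSE
))

# by = NULL (the default): natural join on the only shared column, "id".
# A message reports which variables were used.
natural <- inner_join(players, scores)

# Explicit by: same join, but no message is printed
explicit <- inner_join(players, scores, by = "id")

# Left join keeps every row of x; unmatched rows get NA in the y columns
kept <- left_join(players, scores, by = "id")
```

Here id "a" matches two rows of scores, so both inner joins return two rows, while the left join returns four rows (two for "a", plus "b" and "c" with missing runs).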

Examples

if (require("Lahman")) {
  batting_df <- tbl_df(Batting)
  person_df <- tbl_df(Master)

  # One row per player: drop rows with a duplicated playerID
  uperson_df <- tbl_df(Master[!duplicated(Master$playerID), ])

  # Inner join: match batting and person data
  inner_join(batting_df, person_df)
  inner_join(batting_df, uperson_df)

  # Left join: match, but preserve batting data
  left_join(batting_df, uperson_df)

  # Anti join: find batters without person data
  anti_join(batting_df, person_df)
  # or people who didn't bat
  anti_join(person_df, batting_df)
}
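
The examples above do not exercise semi_join. The sketch below contrasts it with anti_join on small made-up tables; batters and people are hypothetical stand-ins whose column names merely imitate the Lahman data, and recent dplyr releases may warn that tbl_df() is deprecated.

```r
library(dplyr)

# Hypothetical stand-ins for batting and person tables
batters <- tbl_df(data.frame(
  playerID = c("p1", "p1", "p2", "p9"),
  HR = c(3, 5, 0, 1),
  stringsAsFactors = FALSE
))
people <- tbl_df(data.frame(
  playerID = c("p1", "p2", "p3"),
  nameLast = c("Ames", "Bell", "Cole"),
  stringsAsFactors = FALSE
))

# Semi join: rows of x that have a match in y; only x's columns are returned
with_person <- semi_join(batters, people, by = "playerID")

# Anti join: rows of x that have no match in y
without_person <- anti_join(batters, people, by = "playerID")
never_batted <- anti_join(people, batters, by = "playerID")
```

Unlike inner_join, semi_join never duplicates rows of x and adds no columns from y, so with_person has the same columns as batters.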
