m61r (version 0.0.3)

join: Join two data.frames

Description

Join two data.frames.

Usage

left_join_(df, df2, by = NULL, by.x = NULL, by.y = NULL)

anti_join_(df, df2, by = NULL, by.x = NULL, by.y = NULL)

full_join_(df, df2, by = NULL, by.x = NULL, by.y = NULL)

inner_join_(df, df2, by = NULL, by.x = NULL, by.y = NULL)

right_join_(df, df2, by = NULL, by.x = NULL, by.y = NULL)

semi_join_(df, df2, by = NULL, by.x = NULL, by.y = NULL)

Arguments

df

data.frame

df2

data.frame

by

names of the key (pivot) columns to join on, when they are identical in df and df2. If the names differ, use by.x and by.y instead

by.x

names of the key (pivot) columns in df

by.y

names of the key (pivot) columns in df2
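
The difference between by and the by.x / by.y pair can be sketched with base R's merge(), which uses the same argument names. This is an analogue under the assumption that the m61r functions interpret these arguments the same way; it does not call m61r, and the data frames left, right_same and right_diff are made up for illustration.

```r
# Base R analogue (not m61r itself): merge() takes the same by / by.x / by.y names.
left       <- data.frame(id  = c(1, 2, 3), x = c("a", "b", "c"))
right_same <- data.frame(id  = c(2, 3, 4), y = c("B", "C", "D"))
right_diff <- data.frame(key = c(2, 3, 4), y = c("B", "C", "D"))

# Key column has the same name in both inputs: a single `by` is enough.
same_name <- merge(left, right_same, by = "id")

# Key column is named differently in each input: name both sides.
# The key column in the result keeps the name from the first data frame.
diff_name <- merge(left, right_diff, by.x = "id", by.y = "key")

print(same_name)
print(diff_name)
```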

Value

The functions return a data frame. The output has the following properties:

  • For left_join_(), inner_join_(), full_join_(), and right_join_(), the output includes all columns of df and all columns of df2. Columns with identical names in df and df2 receive the suffixes '.x' and '.y'. left_join_() returns all rows of df, with the matching rows of df2. inner_join_() returns only the rows of df that have a match in df2. full_join_() returns all rows of df and all rows of df2. right_join_() returns all rows of df2, with the matching rows of df.

  • For semi_join_() and anti_join_(), the output includes the columns of df only. semi_join_() returns all rows of df that have a match in df2. anti_join_() returns the rows of df that have no match in df2.
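
The row semantics described above can be illustrated with base R alone. The sketch below is not the m61r implementation: it uses merge() and %in% as stand-ins, on two small made-up data frames, to show which key values each kind of join keeps.

```r
# Illustration of the six join semantics in base R (stand-ins, not m61r calls).
df  <- data.frame(k = c(1, 2, 3), a = c("a1", "a2", "a3"))
df2 <- data.frame(k = c(2, 3, 4), b = c("b2", "b3", "b4"))

left  <- merge(df, df2, by = "k", all.x = TRUE)  # every row of df; b is NA where unmatched
inner <- merge(df, df2, by = "k")                # only keys present in both
full  <- merge(df, df2, by = "k", all = TRUE)    # every row of df and of df2
right <- merge(df, df2, by = "k", all.y = TRUE)  # every row of df2; a is NA where unmatched

# Filtering joins keep the columns of df only.
semi <- df[df$k %in% df2$k, , drop = FALSE]      # rows of df with a match in df2
anti <- df[!(df$k %in% df2$k), , drop = FALSE]   # rows of df without a match in df2

print(list(left = left$k, inner = inner$k, full = full$k,
           right = right$k, semi = semi$k, anti = anti$k))
```

Note that merge() sorts the result by the key column by default, so row order here may differ from what the m61r functions return.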

Examples

# NOT RUN {
library(m61r)  # provides the *_join_() functions used below

books <- data.frame(
             name = I(c("Tukey", "Venables", "Tierney","Ripley",
                   "Ripley", "McNeil", "R Core")),
             title = c("Exploratory Data Analysis",
                   "Modern Applied Statistics ...",
                   "LISP-STAT",
                   "Spatial Statistics", "Stochastic Simulation",
                   "Interactive Data Analysis",
                   "An Introduction to R"),
              other.author = c(NA, "Ripley", NA, NA, NA, NA,"Venables & Smith"))

authors <- data.frame(
               surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil","Asimov")),
               nationality = c("US", "Australia", "US", "UK", "Australia","US"),
               deceased = c("yes", rep("no", 4),"yes"))

tmp <- left_join_(books,authors, by.x = "name", by.y = "surname")
head(tmp)

tmp <- inner_join_(books,authors, by.x = "name", by.y = "surname")
head(tmp)

tmp <- full_join_(books,authors, by.x = "name", by.y = "surname")
head(tmp)

tmp <- right_join_(books,authors, by.x = "name", by.y = "surname")
head(tmp)

tmp <- semi_join_(books,authors, by.x = "name", by.y = "surname")
head(tmp)

tmp <- anti_join_(books,authors, by.x = "name", by.y = "surname")
head(tmp)
# }
