x and y should usually be from the same data source, but if copy is TRUE, y will automatically be copied to the same source as x. This may be an expensive operation.

inner_join(x, y, by = NULL, copy = FALSE, ...)
left_join(x, y, by = NULL, copy = FALSE, ...)
right_join(x, y, by = NULL, copy = FALSE, ...)
full_join(x, y, by = NULL, copy = FALSE, ...)
semi_join(x, y, by = NULL, copy = FALSE, ...)
anti_join(x, y, by = NULL, copy = FALSE, ...)
If by is NULL, the default, the join will be a natural join, using all variables with common names across the two tables. A message lists the variables so that you can check they're right. To join by different variables on x and y, use a named vector. For example, by = c("a" = "b") will match x.a to y.b.
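As a minimal sketch of the two ways of specifying by, the example below uses small made-up tables; the table and column names (orders, customers, clients, cust, id) are hypothetical and chosen only for illustration.

```r
library(dplyr)

# Hypothetical example tables, not taken from the documentation.
orders    <- data.frame(cust = c(1, 2, 3), amount = c(10, 20, 30))
customers <- data.frame(cust = c(1, 3, 4), name = c("Ann", "Cy", "Dee"),
                        stringsAsFactors = FALSE)
clients   <- data.frame(id = c(1, 3, 4), name = c("Ann", "Cy", "Dee"),
                        stringsAsFactors = FALSE)

# by = NULL (the default): natural join on the shared column `cust`.
# dplyr prints a message naming the variables it joined by.
natural <- inner_join(orders, customers)

# Named vector: match orders$cust against clients$id.
# The key column keeps its name from the first table.
named <- inner_join(orders, clients, by = c("cust" = "id"))
```

Both calls return the same two rows here (cust 1 and 3). Spelling out by is usually the safer choice in scripts, since a natural join silently changes if a new common column appears.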
If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation, so you must opt into it.

Currently dplyr supports six join types:
inner_join
return all rows from x where there are matching values in y, and all columns from x and y. If there are multiple matches between x and y, all combinations of the matches are returned.

left_join
return all rows from x, and all columns from x and y. Rows in x with no match in y will have NA values in the new columns. If there are multiple matches between x and y, all combinations of the matches are returned.

right_join
return all rows from y, and all columns from x and y. Rows in y with no match in x will have NA values in the new columns. If there are multiple matches between x and y, all combinations of the matches are returned.

semi_join
return all rows from x where there are matching values in y, keeping just columns from x.
A semi join differs from an inner join because an inner join will return one row of x for each matching row of y, whereas a semi join will never duplicate rows of x.

anti_join
return all rows from x where there are no matching values in y, keeping just columns from x.

full_join
return all rows and all columns from both x and y. Where there are no matching values, returns NA for the one missing.

Groups are ignored for the purpose of joining, but the result preserves the grouping of x.
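The join types and the grouping rule described above can be compared side by side on one pair of tables. This is an illustrative sketch only: the tables and the column names key, x_val and y_val are made up, and y deliberately contains a duplicated key so that the difference between an inner join and a semi join is visible.

```r
library(dplyr)

# Hypothetical tables: key 1 appears twice in y, key 4 appears only in y.
x <- data.frame(key = c(1, 2, 3), x_val = c("a", "b", "c"),
                stringsAsFactors = FALSE)
y <- data.frame(key = c(1, 1, 4), y_val = c("p", "q", "r"),
                stringsAsFactors = FALSE)

inner <- inner_join(x, y, by = "key")  # 2 rows: key 1 once per match in y
left  <- left_join(x, y, by = "key")   # 4 rows: keys 2 and 3 get NA in y_val
right <- right_join(x, y, by = "key")  # 3 rows: key 4 gets NA in x_val
full  <- full_join(x, y, by = "key")   # 5 rows: every key from either table
semi  <- semi_join(x, y, by = "key")   # 1 row: key 1, columns of x only
anti  <- anti_join(x, y, by = "key")   # 2 rows: keys 2 and 3, columns of x only

# Grouping on x carries over to the result; grouping on y would not.
grouped <- left_join(group_by(x, x_val), y, by = "key")
group_vars(grouped)
```

Note how inner returns key 1 twice while semi returns it once: semi_join and anti_join only filter x and never add columns or duplicate rows.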