x and y should usually be from the same data source, but if copy is
TRUE, y will automatically be copied to the same source as x. This may
be an expensive operation.

inner_join(x, y, by = NULL, copy = FALSE, ...)
left_join(x, y, by = NULL, copy = FALSE, ...)
semi_join(x, y, by = NULL, copy = FALSE, ...)
anti_join(x, y, by = NULL, copy = FALSE, ...)

If by is NULL, the default, the join will be a natural join, using all
variables with common names across the two tables. A message lists the
variables so that you can check they're right. To join by different
variables on x and y, use a named vector. For example,
by = c("a" = "b") will match x.a to y.b.
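
The two ways of specifying the join variables can be sketched as follows. This is a minimal illustration and not part of the original documentation: the tables orders, prices and catalog, and their columns, are hypothetical.

```r
library(dplyr)

# Hypothetical example tables; names and values are illustrative only.
orders  <- data.frame(id = c(1, 2, 3),
                      item = c("pen", "ink", "pad"),
                      stringsAsFactors = FALSE)
prices  <- data.frame(id = c(1, 2, 4), price = c(1.5, 4, 2))
catalog <- data.frame(item_id = c(1, 2, 4), price = c(1.5, 4, 2))

# by = NULL (the default): natural join on the only shared column, "id".
# dplyr prints a message naming the variables it joined by.
natural <- inner_join(orders, prices)

# Named vector: match orders$id against catalog$item_id.
renamed <- inner_join(orders, catalog, by = c("id" = "item_id"))
```

In both calls only ids 1 and 2 have a match, so each result has two rows. With the named vector, the key column in the result keeps its name from x.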

If x and y are not from the same data source, and copy is TRUE, then y
will be copied into the same src as x. This allows you to join tables
across srcs, but it is a potentially expensive operation, so you must
opt into it.

Currently dplyr supports four join types:
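
inner_join
    Returns all rows from x where there are matching values in y, and
    all columns from x and y.

left_join
    Returns all rows from x, and all columns from x and y.

semi_join
    Returns all rows from x where there are matching values in y,
    keeping just the columns from x. A semi join differs from an inner
    join because an inner join will return one row of x for each
    matching row of y, whereas a semi join will never duplicate rows
    of x.

anti_join
    Returns all rows from x where there are no matching values in y,
    keeping just the columns from x.

Groups are ignored for the purpose of joining, but the result preserves
the grouping of x.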
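
The differences between the four join types, and the grouping behaviour, can be seen on a small example. This is a sketch under assumptions: the customers and orders tables are hypothetical, and group_vars() is assumed to be available in the installed dplyr version.

```r
library(dplyr)

# Hypothetical tables: customer 1 has two orders, customer 3 has none.
customers <- data.frame(cust = c(1, 2, 3),
                        name = c("Ann", "Bob", "Cy"),
                        stringsAsFactors = FALSE)
orders <- data.frame(cust = c(1, 1, 2), total = c(10, 20, 5))

inner <- inner_join(customers, orders, by = "cust")  # Ann appears twice
left  <- left_join(customers, orders, by = "cust")   # Cy kept, total is NA
semi  <- semi_join(customers, orders, by = "cust")   # Ann and Bob, once each
anti  <- anti_join(customers, orders, by = "cust")   # only Cy

# Grouping on x is ignored while matching but carried over to the result.
grouped <- left_join(group_by(customers, name), orders, by = "cust")
group_vars(grouped)
```

Comparing nrow(inner) with nrow(semi) shows the duplication that a semi join avoids: the inner join repeats the row for customer 1 once per matching order, while the semi join returns it a single time.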