Nested filtering joins filter rows from .nest_data based on the presence or absence of matches in y:
nest_semi_join() returns all rows from .nest_data with a match in y.
nest_anti_join() returns all rows from .nest_data without a match in y.
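As a minimal sketch of the difference between the two functions, the example below uses a small made-up data set. The names orders, in_stock, and the column names are hypothetical and chosen only for illustration.

```r
library(nplyr)

# Hypothetical toy data: sales rows for two regions.
orders <- tibble::tibble(
  region = c("east", "east", "west", "west"),
  item   = c("apple", "pear", "apple", "plum"),
  qty    = c(3, 1, 2, 5)
)

# One nested data frame per region, stored in the list-column `sales`.
orders_nest <- tidyr::nest(orders, sales = -region)

# Lookup table to match against (hypothetical).
in_stock <- tibble::tibble(item = c("apple", "plum"))

# Rows of each nested data frame WITH a match in `in_stock`.
kept <- nest_semi_join(orders_nest, sales, in_stock, by = "item")

# Rows of each nested data frame WITHOUT a match in `in_stock`.
dropped <- nest_anti_join(orders_nest, sales, in_stock, by = "item")

kept$sales[[1]]$item     # "apple"
dropped$sales[[1]]$item  # "pear"
```

The outer data frame keeps one row per region in both results; only the contents of the nested data frames change.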
nest_semi_join(.data, .nest_data, y, by = NULL, copy = FALSE, ...)
nest_anti_join(.data, .nest_data, y, by = NULL, copy = FALSE, ...)
An object of the same type as .data. Each object in the column .nest_data will also be of the same type as the input. Each object in .nest_data has the following properties:
Rows are a subset of the input, but appear in the same order.
Columns are not modified.
Data frame attributes are preserved.
Groups are taken from .nest_data. The number of groups may be reduced.
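The properties listed above can be checked on a small example. This is a sketch using made-up data (df, nested, and y are hypothetical names), not an exhaustive test of the guarantees.

```r
library(nplyr)

# Hypothetical toy data with deliberately unsorted ids.
df <- tibble::tibble(
  g   = c("a", "a", "a", "b", "b"),
  id  = c(3, 1, 2, 2, 5),
  val = c(10, 20, 30, 40, 50)
)
nested <- tidyr::nest(df, data = -g)
y <- tibble::tibble(id = c(2, 3))

res <- nest_semi_join(nested, data, y, by = "id")

# The outer object has the same type and number of rows as the input.
identical(class(res), class(nested))
nrow(res) == nrow(nested)

# Columns of each nested data frame are not modified.
identical(names(res$data[[1]]), names(nested$data[[1]]))

# Rows are a subset of the input and keep the input order (3 before 2),
# not the order in which the ids appear in `y`.
res$data[[1]]$id
```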
.data: A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr).
.nest_data: A list-column containing data frames.
y: A data frame, data frame extension (e.g., a tibble), or a lazy data frame (e.g., from dbplyr or dtplyr).
by: A character vector of variables to join by or a join specification created with join_by().
If NULL, the default, nest_*_join() will perform a natural join, using all variables in common across each object in .nest_data and y. A message lists the variables so you can check they're correct; suppress the message by supplying by explicitly.
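A brief sketch of the default behaviour, assuming a made-up data set in which id is the only column shared by the nested data frames and y. The names df, nested, and y are hypothetical.

```r
library(nplyr)

# Hypothetical toy data; `id` is the only column shared with `y`.
df <- tibble::tibble(
  g   = c("a", "a", "b"),
  id  = c(1, 2, 2),
  val = c(10, 20, 30)
)
nested <- tidyr::nest(df, data = -g)
y <- tibble::tibble(id = 2)

# `by` omitted: a natural join on the shared column, reported in a message.
implicit <- nest_semi_join(nested, data, y)

# `by` supplied: same result, no message.
explicit <- nest_semi_join(nested, data, y, by = "id")

identical(implicit$data, explicit$data)
```

Supplying by explicitly is usually the safer choice in scripts, since a natural join silently changes meaning if a new shared column is added to either input.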
To join on different variables between the objects in .nest_data and y, use a named vector. For example, by = c("a" = "b") will match .nest_data$a to y$b for each object in .nest_data.
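The named-vector form might look like the following sketch. The data and the column names item and product are hypothetical.

```r
library(nplyr)

# Hypothetical toy data: the key is called `item` in the nested data
# frames but `product` in the lookup table.
orders <- tibble::tibble(
  region = c("east", "east", "west"),
  item   = c("apple", "pear", "plum")
)
orders_nest <- tidyr::nest(orders, sales = -region)
catalog <- tibble::tibble(product = c("apple", "plum"))

# Match sales$item to catalog$product within every nested data frame.
matched <- nest_semi_join(
  orders_nest, sales, catalog,
  by = c("item" = "product")
)

matched$sales[[1]]$item  # "apple"
```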
To join by multiple variables, use a vector with length >1. For example, by = c("a", "b") will match .nest_data$a to y$a and .nest_data$b to y$b for each object in .nest_data. Use a named vector to match different variables in .nest_data and y. For example, by = c("a" = "b", "c" = "d") will match .nest_data$a to y$b and .nest_data$c to y$d for each object in .nest_data.
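Both multi-variable forms can be sketched on a made-up data set. The names readings, flagged_same, and flagged_diff and all column names are hypothetical.

```r
library(nplyr)

# Hypothetical toy data: observations keyed by year and month.
readings <- tibble::tibble(
  site  = c("s1", "s1", "s1", "s2", "s2"),
  year  = c(2020, 2020, 2021, 2020, 2021),
  month = c(1, 2, 1, 1, 1)
)
readings_nest <- tidyr::nest(readings, obs = -site)

# The same year/month pairs to exclude, once with matching column names
# and once with different column names.
flagged_same <- tibble::tibble(year = c(2020, 2021), month = c(2, 1))
flagged_diff <- tibble::tibble(yr = c(2020, 2021), mo = c(2, 1))

# Unnamed vector: obs$year to y$year and obs$month to y$month.
res_same <- nest_anti_join(
  readings_nest, obs, flagged_same,
  by = c("year", "month")
)

# Named vector: obs$year to y$yr and obs$month to y$mo.
res_diff <- nest_anti_join(
  readings_nest, obs, flagged_diff,
  by = c("year" = "yr", "month" = "mo")
)

identical(res_same$obs, res_diff$obs)
```

A row is dropped only when both key columns match the same row of y, so each site keeps its January 2020 observation here.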
To perform a cross-join, generating all combinations of each object in .nest_data and y, use by = character().
copy: If .nest_data and y are not from the same data source and copy = TRUE, then y will be copied into the same src as .nest_data.
(Need to review this parameter in more detail for applicability with nplyr)
...: One or more unquoted expressions separated by commas. Variable names can be used as if they were positions in the data frame, so expressions like x:y can be used to select a range of variables.
nest_semi_join() and nest_anti_join() are largely wrappers for dplyr::semi_join() and dplyr::anti_join() and maintain the functionality of semi_join() and anti_join() within each nested data frame. For more information on semi_join() or anti_join(), please refer to the documentation in dplyr.
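To illustrate the wrapper relationship, the sketch below compares nest_semi_join() with a hand-written version that maps dplyr::semi_join() over the list-column. The hand-written version is an assumption about what an equivalent looks like, not necessarily how nplyr is implemented, and the data are made up.

```r
library(nplyr)

# Hypothetical toy data.
df <- tibble::tibble(
  g  = c("a", "a", "b", "b"),
  id = c(1, 2, 2, 3)
)
nested <- tidyr::nest(df, data = -g)
y <- tibble::tibble(id = 2)

via_nplyr <- nest_semi_join(nested, data, y, by = "id")

# Hand-rolled equivalent: apply dplyr::semi_join() to each nested data frame.
via_dplyr <- dplyr::mutate(
  nested,
  data = purrr::map(data, function(d) dplyr::semi_join(d, y, by = "id"))
)

all.equal(via_nplyr$data, via_dplyr$data)
```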
Other joins: nest-mutate-joins, nest_nest_join()
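library(nplyr)
library(dplyr)  # for the %>% pipe
gm_nest <- gapminder::gapminder %>% tidyr::nest(country_data = -continent)
gm_codes <- gapminder::country_codes %>% dplyr::slice_sample(n = 10)
# Keep rows whose country appears in the sampled codes.
gm_nest %>% nest_semi_join(country_data, gm_codes, by = "country")
# Keep rows whose country does not appear in the sampled codes.
gm_nest %>% nest_anti_join(country_data, gm_codes, by = "country")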