This is a method for the dplyr::filter() generic.
See the "Fallbacks" section for differences in implementation.
The filter() function is used to subset a data frame, retaining all rows that satisfy your conditions. To be retained, the row must produce a value of TRUE for all conditions. Note that when a condition evaluates to NA the row will be dropped, unlike base subsetting with [.
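The difference in NA handling can be illustrated with a small sketch. It uses plain dplyr on an ordinary data frame (the data frame `df` and its columns are made up for illustration); the duckplyr method is described above as a method for the same generic, so the same retention rule is assumed to apply.

```r
library(dplyr)

# Hypothetical example data: one value of x is missing
df <- data.frame(x = c(1, NA, 3), y = c("a", "b", "c"))

# The condition x > 1 evaluates to FALSE, NA, TRUE.
# filter() keeps only rows where it is TRUE, so the NA row is dropped.
kept <- filter(df, x > 1)

# Base subsetting with [ keeps a row for the NA index, filled with NA values.
base_kept <- df[df$x > 1, ]

nrow(kept)
nrow(base_kept)
```

Here `kept` has one row, while `base_kept` has two, one of which consists entirely of NA.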
# S3 method for duckplyr_df
filter(.data, ..., .by = NULL, .preserve = FALSE)
.data: A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.
...: <data-masking> Expressions that return a logical value, and are defined in terms of the variables in .data. If multiple expressions are included, they are combined with the & operator. Only rows for which all conditions evaluate to TRUE are kept.
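As a minimal sketch of how multiple expressions combine, the following compares passing two conditions separately with joining them explicitly using `&`. It uses plain dplyr and a made-up data frame; both calls are expected to return the same rows.

```r
library(dplyr)

# Hypothetical example data
df <- data.frame(x = 1:5, y = c(5, 4, 3, 2, 1))

# Two separate conditions ...
separate <- filter(df, x >= 2, y >= 2)

# ... behave like a single condition joined with &
combined <- filter(df, x >= 2 & y >= 2)

identical(separate, combined)
```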
.by: <tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.
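A short sketch of per-operation grouping, assuming dplyr 1.1.0 or later (where `.by` is available) and a made-up data frame. It keeps the row with the largest `x` within each value of `g`, once with `.by` and once with an explicit `group_by()`.

```r
library(dplyr)

# Hypothetical example data with a grouping column g
df <- data.frame(g = c("a", "a", "b", "b"), x = c(1, 5, 2, 3))

# Group by g for this one call only; the result is not grouped
by_result <- filter(df, x == max(x), .by = g)

# Equivalent approach with persistent grouping, removed again afterwards
grouped <- group_by(df, g)
grouped_result <- ungroup(filter(grouped, x == max(x)))

by_result$x
grouped_result$x
```

With `.by`, no `ungroup()` step is needed because the grouping applies only to that call.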
.preserve: Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data; otherwise the grouping is kept as is.
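The effect of `.preserve` can be sketched by counting groups after a filter that removes every row of one group. This uses plain dplyr with a made-up grouped data frame; `n_groups()` is dplyr's helper for counting groups.

```r
library(dplyr)

# Hypothetical example data: group "b" has a single row with x = 3
df <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 3))
grouped <- group_by(df, g)

# Default: grouping is recalculated, so the now-empty group "b" disappears
recalculated <- filter(grouped, x < 3)

# .preserve = TRUE: the original grouping structure is kept, including "b"
preserved <- filter(grouped, x < 3, .preserve = TRUE)

n_groups(recalculated)
n_groups(preserved)
```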
There is no DuckDB translation in filter.duckplyr_df() with no filter conditions, nor for a grouped operation (if .by is set). These features fall back to dplyr::filter(); see vignette("fallback") for details.
library(duckplyr)

# Create a DuckDB-backed tibble and keep rows where x is at least 2
df <- duckdb_tibble(x = 1:3, y = 3:1)
filter(df, x >= 2)