This is a method for the dplyr::filter() generic.
See the "Fallbacks" section for differences in implementation.
The filter() function is used to subset a data frame,
retaining all rows that satisfy your conditions.
To be retained, the row must produce a value of TRUE for all conditions.
Note that when a condition evaluates to NA the row will be dropped,
unlike base subsetting with [.
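The NA behaviour described above can be illustrated with a minimal sketch. It assumes the duckplyr package is installed; the column values and the variable names (`kept`, `kept_with_na`, `base_df`, `base_rows`) are made up for illustration.

```r
library(duckplyr)

# The second row has a missing x, so the condition x >= 2 evaluates to NA there
df <- duckdb_tibble(x = c(1L, NA, 3L), y = 3:1)

# filter() keeps only rows where the condition is TRUE; the NA row is dropped
kept <- filter(df, x >= 2)

# To retain rows with a missing x, ask for them explicitly
kept_with_na <- filter(df, x >= 2 | is.na(x))

# Base subsetting with [ on a plain data frame returns a row of NAs instead
base_df <- data.frame(x = c(1L, NA, 3L), y = 3:1)
base_rows <- base_df[base_df$x >= 2, ]
```

Here `kept` has one row, while both `kept_with_na` and `base_rows` have two; in `base_rows` the extra row consists entirely of NA values.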
# S3 method for duckplyr_df
filter(.data, ..., .by = NULL, .preserve = FALSE)
.data
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.
...
<data-masking> Expressions that
return a logical value, and are defined in terms of the variables in
.data. If multiple expressions are included, they are combined with the
& operator. Only rows for which all conditions evaluate to TRUE are
kept.
.by
<tidy-select> Optionally, a selection of columns to
group by for just this operation, functioning as an alternative to group_by(). For
details and examples, see ?dplyr_by.
.preserve
Relevant when the .data input is grouped.
If .preserve = FALSE (the default), the grouping structure
is recalculated based on the resulting data, otherwise the grouping is kept as is.
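As a small sketch of how several expressions passed through `...` combine, the two calls below should select the same rows. This assumes duckplyr is installed; the names `with_commas` and `with_ampersand` are illustrative only.

```r
library(duckplyr)

df <- duckdb_tibble(x = 1:3, y = 3:1)

# Two separate conditions: a row is kept only if both are TRUE
with_commas <- filter(df, x >= 2, y >= 2)

# The same conditions written as a single expression joined with &
with_ampersand <- filter(df, x >= 2 & y >= 2)
```

Only the row with `x = 2, y = 2` satisfies both conditions, so each result contains that single row.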
filter.duckplyr_df() has no DuckDB translation
when called with no filter conditions,
or for a grouped operation (if .by is set).
These cases fall back to dplyr::filter(); see vignette("fallback") for details.
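The two fallback cases can be sketched as follows. This is an illustration under the assumption that duckplyr is installed; the data and the names `top_per_group` and `all_rows` are hypothetical. According to the text above, both calls are handled by dplyr::filter() rather than DuckDB, and the results should match what dplyr would return.

```r
library(duckplyr)

df <- duckdb_tibble(g = c("a", "a", "b"), x = c(1L, 5L, 3L))

# Grouped operation via .by: keep the row with the largest x within each g
top_per_group <- filter(df, x == max(x), .by = g)

# No filter conditions: every row is retained
all_rows <- filter(df)
```

`top_per_group` holds one row per group (x = 5 for "a", x = 3 for "b"), and `all_rows` has the same three rows as `df`.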
library(duckplyr)
df <- duckdb_tibble(x = 1:3, y = 3:1)
# Keep only the rows where x is at least 2
filter(df, x >= 2)