f_duplicates: Find duplicate rows

Description

Find duplicate rows

Usage

f_duplicates(
  .data,
  ...,
  .keep_all = FALSE,
  .both_ways = FALSE,
  .add_count = FALSE,
  .drop_empty = FALSE,
  .order = FALSE,
  .sort = deprecated(),
  .by = NULL,
  .cols = NULL
)

Value

A data.frame of duplicate rows.

Arguments

.data: A data frame.
...: Variables used to find duplicate rows.
.keep_all: If TRUE then all columns of the data frame are kept. The default is FALSE, which keeps only the columns used to find duplicates.
.both_ways: If TRUE then both the duplicate rows and the first instance of each duplicated row are retained. The default is FALSE, which returns only the duplicate rows. Setting this to TRUE can be particularly useful when examining the differences between duplicate rows.
.add_count: If TRUE then a count column is added to denote the number of duplicates (including the first, non-duplicate instance). The naming convention of this column follows dplyr::add_count().
.drop_empty: If TRUE then empty rows with all NA values are removed. The default is FALSE.
.order: Should the groups be calculated as ordered groups? Setting to TRUE here implies that the groups are returned sorted.
.sort: Use .order instead.
.by: (Optional). A selection of columns to group by for this operation. Columns are specified using tidy-select.
.cols: (Optional). An alternative to ... that accepts a named character vector or numeric vector. Recommended when speed is a priority.
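
As an illustration of how the main arguments interact, here is a minimal sketch on a small, hypothetical data frame (the data and the column names id and value are invented for this example). It assumes fastplyr is installed and that the arguments behave as described above.

```r
library(fastplyr)

# Hypothetical toy data: `id` repeats, `value` differs within some repeats
df <- data.frame(
  id = c(1, 1, 2, 3, 3, 3),
  value = c("a", "b", "c", "d", "d", "e")
)

# Default: only the columns named in `...`, and only the repeated rows
dups <- f_duplicates(df, id)

# Keep every column and also retain the first instance of each duplicated id
both <- f_duplicates(df, id, .keep_all = TRUE, .both_ways = TRUE)

# Add a count column giving the size of each duplicate group
counted <- f_duplicates(df, id, .both_ways = TRUE, .add_count = TRUE)

# Select columns by name through `.cols` instead of `...`
by_cols <- f_duplicates(df, .cols = "id")
```

With .both_ways = TRUE and .keep_all = TRUE the differing value entries for each repeated id sit side by side, which is the use case the .both_ways description mentions.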

Details

This function works like dplyr::distinct() in its handling of arguments and data-masking but returns duplicate rows. In certain situations it can be much faster than data |> group_by() |> filter(n() > 1) when there are many groups.
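
The comparison with the dplyr pipeline can be sketched as follows. This is an illustrative example on invented data, not a benchmark. It assumes that .both_ways = TRUE together with .keep_all = TRUE is the closest match to the group_by() and filter() pipeline, since that pipeline also keeps the first row of each repeated group.

```r
library(fastplyr)
library(dplyr)

# Hypothetical data: group "a" has 2 rows, "b" has 1, "c" has 3
df <- data.frame(
  g = c("a", "a", "b", "c", "c", "c"),
  v = 1:6
)

# dplyr pipeline from the text: keep every row whose group has more than one member
via_dplyr <- df |>
  group_by(g) |>
  filter(n() > 1) |>
  ungroup()

# Assumed closest f_duplicates() call: first instances are retained as well
via_fastplyr <- f_duplicates(df, g, .keep_all = TRUE, .both_ways = TRUE)

# Both approaches should select the same rows of `df`
nrow(via_dplyr)
nrow(via_fastplyr)
```

With the default .both_ways = FALSE the result would instead omit the first row of each repeated group, so it would not match the dplyr pipeline row for row.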
