find_duplicates: Identify and return duplicated rows in a data frame or linelist.
Description
Identify and return duplicated rows in a data frame or linelist.
Usage
find_duplicates(data, target_columns = NULL)
Value
A <data.frame> or <linelist> of all duplicated rows,
with the following two additional columns:
row_id
The indices of the duplicated rows in the input data.
Users can use these indices to choose which rows they consider
redundant in each group of duplicates.
group_id
A unique identifier associated with each group of
duplicates.
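To make the meaning of row_id and group_id concrete, here is a minimal base R sketch that builds a result of the same shape on a toy data frame. It is not the package's implementation: the toy data and the names is_dup, key, and dups are assumptions made for illustration, and the row ordering and exact group numbering returned by find_duplicates() may differ.

```r
# Toy data (hypothetical): rows 1 and 3 are identical, as are rows 2 and 5.
df <- data.frame(
  id   = c(1, 2, 1, 3, 2),
  name = c("a", "b", "a", "c", "b"),
  stringsAsFactors = FALSE
)

# Flag every row that has at least one identical twin,
# including the first occurrence in each group.
is_dup <- duplicated(df) | duplicated(df, fromLast = TRUE)

# Keep only the duplicated rows and record their positions in the input.
dups <- df[is_dup, , drop = FALSE]
dups$row_id <- which(is_dup)

# Build one key per row from the original columns; identical rows
# share a key and therefore end up in the same group.
key <- do.call(paste, c(dups[names(df)], sep = "\r"))
dups$group_id <- match(key, unique(key))

dups
```

In this sketch, rows 1 and 3 share one group_id and rows 2 and 5 share another, so a user could decide to drop, say, the rows with row_id 3 and 5 and keep one row per group.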
Arguments
data
The input <data.frame> or <linelist>.
target_columns
A <vector> of column names or indices to
consider when looking for duplicates. When the input data is a
<linelist> object, this parameter can be set to
linelist_tags to look for duplicates across the tagged
columns only. Its default value is NULL, which considers
duplicates across all columns.
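The effect of target_columns can be illustrated with a short base R sketch. The helper flag_dups() and the toy data are hypothetical and only mimic the idea of restricting the comparison to a subset of columns. The final call uses the signature shown under Usage and only runs if a find_duplicates() function is available in the session.

```r
# Toy data (hypothetical): rows 1 and 3 agree on id and name but not on visit.
df <- data.frame(
  id    = c(1, 2, 1, 3),
  name  = c("a", "b", "a", "c"),
  visit = c("2024-01-01", "2024-01-02", "2024-01-05", "2024-01-03"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (base R only): flag rows duplicated on the given columns.
flag_dups <- function(data, cols = names(data)) {
  sub <- data[, cols, drop = FALSE]
  duplicated(sub) | duplicated(sub, fromLast = TRUE)
}

# Across all columns (the analogue of target_columns = NULL), no row repeats.
all_cols <- which(flag_dups(df))

# Restricted to id and name, rows 1 and 3 count as duplicates.
id_name <- which(flag_dups(df, c("id", "name")))

# Equivalent call, assuming the package that provides find_duplicates() is attached.
if (exists("find_duplicates", mode = "function")) {
  find_duplicates(df, target_columns = c("id", "name"))
}
```

Restricting the comparison is useful when some columns, such as a visit date here, legitimately differ between records that describe the same entity.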