Get rows of a data.frame with identical values for the specified variables.
For hunting duplicate records during data cleaning. Specify the data.frame and the variable combination to search for duplicates and get back the duplicated rows.
get_dupes(dat, ...)
Returns a data.frame with the full records where the specified variables have duplicated values, as well as a variable dupe_count showing the number of rows sharing that combination of duplicated values. If the input data.frame was of class tbl_df, the output is as well.
dat: The input data.frame.
...: Unquoted variable names to search for duplicates. This takes a tidyselect specification.
library(janitor) # provides get_dupes()
get_dupes(mtcars, mpg, hp)
# or called with the magrittr pipe %>% :
mtcars %>% get_dupes(wt)
# You can use tidyselect helpers to specify variables:
mtcars %>% get_dupes(-c(wt, qsec))
mtcars %>% get_dupes(starts_with("cy"))
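To make the shape of the return value concrete, here is a rough base-R sketch of the idea described above: keep the full rows whose chosen variables repeat, and attach a dupe_count column. The helper dupes_sketch() and the people data.frame are hypothetical names made up for illustration. This is not janitor's implementation, it takes quoted column names rather than a tidyselect specification, and it does not try to reproduce get_dupes()'s column order or row sorting.

```r
# Hypothetical helper, not part of janitor: approximates the idea behind get_dupes().
dupes_sketch <- function(dat, vars) {
  # Build one key per row from the chosen columns.
  key <- do.call(paste, c(unname(as.list(dat[vars])), sep = "\r"))
  # Count how many rows share each key.
  dupe_count <- as.integer(ave(seq_along(key), key, FUN = length))
  out <- cbind(dat, dupe_count = dupe_count)
  # Keep only rows whose combination occurs more than once.
  out[dupe_count > 1, , drop = FALSE]
}

# Made-up example data.
people <- data.frame(
  id = c(1, 2, 2, 3, 3, 3),
  name = c("a", "b", "b", "c", "c", "d"),
  stringsAsFactors = FALSE
)

res <- dupes_sketch(people, c("id", "name"))
print(res)
```

Checking id alone with dupes_sketch(people, "id") would also flag the row with name "d", because the duplicate test only looks at the variables you pass in.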