Given a data.frame or data.table object and a named list of id_vars, assert that every possible combination of id_vars exists in the dataset, that the dataset contains no combinations absent from id_vars, and that no combination of id_vars appears more than once in the dataset. If ids_only = T and assert_dups = T, returns the combinations of id_vars that have duplicates along with n_duplicates, the count of duplicates within each combination. If ids_only = F, returns all duplicate observations from the original dataset along with n_duplicates and duplicate_id, a unique ID for each duplicate value within each combination of id_vars.
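The combination check described above can be approximated in base R. This is a minimal sketch, not the package's implementation: check_combos and its key helper are hypothetical names, and the comparison by pasted character keys is an assumption made for simplicity.

```r
# Hypothetical helper: compare the combinations implied by id_vars with
# the combinations actually present in the data (base R only).
check_combos <- function(data, id_vars) {
  data <- as.data.frame(data)
  # Every combination the data is expected to contain.
  target <- expand.grid(id_vars, stringsAsFactors = FALSE)
  # Every combination that actually occurs in the data.
  present <- unique(data[names(id_vars)])
  # Collapse each row to one string so factor and character columns compare equal.
  key <- function(df) {
    do.call(paste, c(unname(lapply(df[names(id_vars)], as.character)), sep = "|"))
  }
  list(
    missing = target[!key(target) %in% key(present), , drop = FALSE],
    extra   = present[!key(present) %in% key(target), , drop = FALSE]
  )
}

ids <- list(Plant = as.character(unique(CO2$Plant)), conc = unique(CO2$conc))

full     <- check_combos(CO2, ids)        # complete data
dropped  <- check_combos(CO2[-1, ], ids)  # first row removed
narrowed <- check_combos(CO2, list(Plant = ids$Plant,
                                   conc  = setdiff(ids$conc, 95)))

nrow(full$missing)    # combinations in id_vars that the data lacks
nrow(dropped$missing)
nrow(narrowed$extra)  # combinations in the data that id_vars lacks
```

The missing element corresponds to combinations expected but not found, and extra to combinations found but not expected; assert_ids treats either as a violation of assert_combos.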
assert_ids(data, id_vars, assert_combos = TRUE, assert_dups = TRUE,
ids_only = TRUE, warn_only = FALSE, quiet = FALSE)
data: A data.frame or data.table.
id_vars: A named list of vectors, where the name of each vector must correspond to a column in data.
assert_combos: Assert that the data object must contain all combinations of id_vars. Default = T.
assert_dups: Assert that the data object must not contain duplicate values within any combination of id_vars. Default = T.
ids_only: By default, with assert_dups = T, the function returns the unique combinations of id_vars that have duplicate observations. If ids_only = F, returns every observation in the original dataset that is a duplicate.
warn_only: Do you want to warn, rather than error? Will return all offending rows from the first violation of the assertion. Default = F.
quiet: Do you want to suppress the printed message when a test is passed? Default = F.
Throws an error if a test is violated and prints the offending rows. If warn_only = T, returns all offending rows and only warns.
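To make the shape of the duplicate output concrete, here is a base-R sketch that builds columns analogous to n_duplicates and duplicate_id. The helper count_dups is hypothetical, and it assumes that n_duplicates is the number of rows sharing a combination and that duplicate_id numbers those rows from 1; the package's own conventions may differ.

```r
# Hypothetical helper: return the rows whose ID combination occurs more
# than once, with a per-combination count and a running ID.
count_dups <- function(data, id_names) {
  data <- as.data.frame(data)
  key <- do.call(paste, c(unname(lapply(data[id_names], as.character)), sep = "|"))
  # Number of rows sharing each row's combination.
  n <- as.integer(ave(seq_along(key), key, FUN = length))
  dup_rows <- data[n > 1, , drop = FALSE]
  dup_rows$n_duplicates <- n[n > 1]
  # Running index within each duplicated combination.
  dup_rows$duplicate_id <- as.integer(
    ave(seq_len(nrow(dup_rows)), key[n > 1], FUN = seq_along)
  )
  dup_rows
}

co2 <- as.data.frame(CO2)
with_dups <- rbind(co2, co2[1:2, ])  # repeat the first two observations

# Analogue of ids_only = F: every duplicated observation.
dups <- count_dups(with_dups, c("Plant", "conc"))

# Analogue of ids_only = T: one row per duplicated combination.
dup_ids <- unique(dups[c("Plant", "conc", "n_duplicates")])
```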
Note: if assert_combos = T and the combination check is violated, assert_ids stops and returns the results for assert_combos before evaluating the assert_dups check. To make sure both checks are evaluated even when assert_combos is violated, call assert_ids twice with warn_only = T (once with assert_dups = F, then once with assert_combos = F), and then conditionally stop your code if either call returns results.
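The two-call pattern from the note might look like the sketch below. It uses only the arguments documented on this page; has_rows and bad are hypothetical names, and the sketch assumes that a call reporting a violation returns a data frame with at least one row, while the return value of a passing call is not relied on. The calls are guarded so the code is skipped when assertable is not installed.

```r
# Hypothetical helper: TRUE when a call returned offending rows.
has_rows <- function(x) is.data.frame(x) && nrow(x) > 0

if (requireNamespace("assertable", quietly = TRUE)) {
  ids <- list(Plant = as.character(unique(CO2$Plant)), conc = unique(CO2$conc))
  bad <- CO2[-1, ]  # illustrative input with one combination removed

  # Run each check on its own so that neither one masks the other.
  combo_res <- assertable::assert_ids(bad, ids, assert_dups = FALSE,
                                      warn_only = TRUE)
  dup_res   <- assertable::assert_ids(bad, ids, assert_combos = FALSE,
                                      warn_only = TRUE)

  if (has_rows(combo_res) || has_rows(dup_res)) {
    # Replace message() with stop() to halt a script at this point.
    message("id_vars do not completely and uniquely identify the data")
  }
}
```

Keeping both results around before deciding to stop lets you log or inspect the missing combinations and the duplicates together.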
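# NOT RUN {
library(assertable)
# Expected ID values: every plant and every CO2 concentration in the dataset
plants <- as.character(unique(CO2$Plant))
concs <- unique(CO2$conc)
ids <- list(Plant = plants, conc = concs)
# Check that CO2 contains each Plant-conc combination exactly once
assert_ids(CO2, ids)
# }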