This function compares a new and an old version of a data set to identify inserted, deleted, and updated records, as well as column-level changes. The comparison can be performed using a specified index column (or columns), or—if no index is provided—based on a full-row comparison across all common columns.
reconcile(new_df, old_df, index = NA, lookup_columns = NA)
Returns a data frame summarizing the reconciliation results. For each record, the output includes the current values, index variables, detected status (`"NEW"`, `"DELETED"`, `"UPDATED"`, `"UNCHANGED"`), the set of changed columns, and a human-readable description of the differences.
`new_df`: A data frame containing the most recent version of the data.
`old_df`: A data frame containing the preceding version of the data, used as the reference for comparison.
`index`: A character vector specifying the variable(s) that uniquely identify records (e.g., `"recordid"`). If `NA`, all common columns are used as the matching key, but some enhanced functionality (such as detecting newly added or removed rows) will not be available.
`lookup_columns`: A character vector specifying which columns should be compared. The default is `NA`, meaning that all columns common to both `new_df` and `old_df` are used. If specific column names are provided, comparisons are restricted to those columns.
Lukasz Andrzejewski
When `index` is supplied, rows are matched by the specified index variable(s), allowing the function to detect newly added records, removed records, and detailed field-level changes. When `index = NA`, the function falls back to a full-row comparison performed by an auxiliary comparison routine, using all common columns as the key.
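The index-based matching described above can be illustrated with a minimal base R sketch. It is not the package's implementation: `classify_by_index` is a hypothetical helper, and comparing values as character strings is an assumption made only to keep the example short. It shows one way the four statuses could be derived once rows are paired by their index.

```r
# Hypothetical helper (not part of the package): assign a status to each
# record by pairing rows of new_df and old_df on the index column(s).
classify_by_index <- function(new_df, old_df, index) {
  # One string key per row, built from the index column(s).
  key_new <- do.call(paste, c(new_df[index], sep = "\r"))
  key_old <- do.call(paste, c(old_df[index], sep = "\r"))

  # Non-index columns present in both data frames are compared.
  cols <- setdiff(intersect(names(new_df), names(old_df)), index)

  # Position of each new row's partner in old_df (NA when there is none).
  pos <- match(key_new, key_old)

  status <- vapply(seq_along(key_new), function(i) {
    if (is.na(pos[i])) return("NEW")
    # Assumption: values are compared as character strings, so two NAs match.
    same <- vapply(cols, function(col) {
      identical(as.character(new_df[[col]][i]),
                as.character(old_df[[col]][pos[i]]))
    }, logical(1))
    if (all(same)) "UNCHANGED" else "UPDATED"
  }, character(1))

  # Index values found only in old_df correspond to deleted records.
  deleted <- old_df[!key_old %in% key_new, index, drop = FALSE]

  rbind(
    data.frame(new_df[index], status = status,
               row.names = NULL, stringsAsFactors = FALSE),
    data.frame(deleted, status = rep("DELETED", nrow(deleted)),
               row.names = NULL, stringsAsFactors = FALSE)
  )
}

new_df <- data.frame(col1 = c("AA", "B"), id = c(1, 2))
old_df <- data.frame(col1 = c("A", "B"), id = c(1, 3))
result <- classify_by_index(new_df, old_df, index = "id")
print(result)
```

In this sketch, a record is `"NEW"` or `"DELETED"` purely on the basis of its index, which is why those two statuses cannot be derived when no index is available.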
Column comparison is further controlled by `lookup_columns`: if this argument is left as `NA`, all columns common to `new_df` and `old_df` are evaluated; otherwise, only the specified subset of columns is compared.
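The column selection rule can be sketched in a few lines of base R. `resolve_columns` is a hypothetical name used for illustration only, and silently dropping requested names that are missing from either data frame is an assumption; the actual function may handle that case differently.

```r
# Hypothetical helper (not part of the package): decide which columns
# take part in the comparison.
resolve_columns <- function(new_df, old_df, lookup_columns = NA) {
  # Columns present in both versions of the data.
  common <- intersect(names(new_df), names(old_df))

  # Default: compare every common column.
  if (all(is.na(lookup_columns))) {
    return(common)
  }

  # Assumption: requested columns missing from either data frame are dropped.
  intersect(lookup_columns, common)
}

new_df <- data.frame(a = 1, b = 2, c = 3)
old_df <- data.frame(b = 1, c = 2, d = 3)

all_cols <- resolve_columns(new_df, old_df)
some_cols <- resolve_columns(new_df, old_df, lookup_columns = c("c", "d"))
print(all_cols)
print(some_cols)
```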
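# id 1 differs in col1 ("A" in old_df, "AA" in new_df);
# id 2 appears only in new_df and id 3 only in old_df.
reconcile(
  new_df = data.frame(col1 = c("AA", "B"), id = c(1, 2)),
  old_df = data.frame(col1 = c("A", "B"), id = c(1, 3)),
  index = "id"
)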