compareDF (version 1.7.1)

compare_df: Compare Two dataframes

Description

Perform a git-style comparison between two data frames with a similar column structure.

Usage

compare_df(df_new, df_old, group_col, exclude = NULL, limit_html = 100,
  tolerance = 0, tolerance_type = "ratio", stop_on_error = TRUE,
  keep_unchanged = FALSE,
  color_scheme = c(addition = "green", removal = "red",
    unchanged_cell = "gray", unchanged_row = "deepskyblue"),
  html_headers = NULL, html_change_col_name = "chng_type",
  html_group_col_name = "grp")

Arguments

df_new

The data frame for which any changes will be shown as an addition (green)

df_old

The data frame for which any changes will be shown as a removal (red)

group_col

A character vector naming the column or columns by which to group rows for the comparison.

exclude

The columns which should be excluded from the comparison

limit_html

Maximum number of rows to show in the HTML diff. Values above 1000 are not recommended.

tolerance

The tolerance within which changes in numeric values are ignored in the visual representation; with the default tolerance_type of 'ratio' it is expressed as a fraction. Defaults to 0, so every change in value is shown. Does not apply to categorical variables.

tolerance_type

The type of comparison used for numeric values: either 'ratio' or 'difference'. Defaults to 'ratio'.
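
The distinction between the two tolerance types can be illustrated with a small base-R sketch. The helper `within_tolerance()` is hypothetical and not part of compareDF; in particular, measuring the ratio against the old value is an assumption made for illustration, and the package's internal rule may differ.

```r
# Hypothetical helper (not part of compareDF): shows one plausible reading of
# the two tolerance types described above.
within_tolerance <- function(new, old, tolerance = 0, tolerance_type = "ratio") {
  change <- abs(new - old)
  if (tolerance_type == "ratio") {
    # Relative change, measured against the old value (an assumption).
    change <= tolerance * abs(old)
  } else if (tolerance_type == "difference") {
    # Absolute change.
    change <= tolerance
  } else {
    stop("tolerance_type must be 'ratio' or 'difference'")
  }
}

# A change from 100 to 104 is a 4% relative change, but an absolute change of 4.
ratio_ok <- within_tolerance(104, 100, tolerance = 0.05, tolerance_type = "ratio")
diff_ok  <- within_tolerance(104, 100, tolerance = 0.05, tolerance_type = "difference")
print(c(ratio = ratio_ok, difference = diff_ok))
```

Under this reading, the same tolerance value of 0.05 hides the change in 'ratio' mode but not in 'difference' mode, so the value should be chosen with the tolerance type in mind.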

stop_on_error

Whether to stop on acceptable errors or not. Defaults to TRUE.

keep_unchanged

Whether to preserve unchanged values or not. Defaults to FALSE.

color_scheme

The color scheme to use for the HTML output. Should be a vector or list with named elements. Defaults to c("addition" = "green", "removal" = "red", "unchanged_cell" = "gray", "unchanged_row" = "deepskyblue").

html_headers

A character vector of column names to be used in the table. Defaults to colnames.

html_change_col_name

Name of the change column to use in the HTML table. Defaults to chng_type.

html_group_col_name

Name of the group column to be used in the table (if there are multiple grouping vars). Defaults to grp.

Examples

# NOT RUN {
old_df = data.frame(var1 = c("A", "B", "C"),
                    val1 = c(1, 2, 3))
new_df = data.frame(var1 = c("A", "B", "C"),
                    val1 = c(1, 2, 4))
ctable = compare_df(new_df, old_df, c("var1"))
print(ctable$comparison_df)
ctable$html_output
# }
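
As a further sketch, the tolerance arguments listed under Arguments can be passed in the same way. This assumes the compareDF package is installed and that the argument names match the Usage section; the data values are made up for illustration, and no particular output is claimed here.

```r
# Illustrative data: "B" changes by 0.1 and "C" changes by 7.
old_df <- data.frame(var1 = c("A", "B", "C"),
                     val1 = c(1, 2, 3))
new_df <- data.frame(var1 = c("A", "B", "C"),
                     val1 = c(1, 2.1, 10))

# Only run the comparison if compareDF is available.
if (requireNamespace("compareDF", quietly = TRUE)) {
  # Ask for absolute differences of up to 0.5 to be ignored.
  ctable <- compareDF::compare_df(new_df, old_df, group_col = "var1",
                                  tolerance = 0.5,
                                  tolerance_type = "difference")
  print(ctable$comparison_df)
}
```

With these settings one would expect the small change for "B" to be ignored and the larger change for "C" to be reported, but the exact handling of values at or near the tolerance boundary is worth checking against the installed version.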
