contingency_similarity: Contingency similarity between real and synthetic categorical column pairs
Description
For each pair of categorical columns, compares the joint (normalized
contingency) distributions of real and synthetic data via total variation
distance (TVD) and scores the pair as 1 - TVD (the SDMetrics
ContingencySimilarity score). Scores lie in [0, 1], with 1 meaning the
two joint distributions are identical.
This is the categorical analogue of correlation similarity and captures
categorical-vs-categorical dependence.
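As a sketch of the per-pair score described above, the formula can be written as follows. The symbols are notation introduced here for illustration, not taken from the package: A and B are the sets of categories of the two columns, and R and S are the normalized contingency tables (joint frequencies) of the real and synthetic data.

```latex
\mathrm{score}(A, B)
  = 1 - \mathrm{TVD}(R, S)
  = 1 - \frac{1}{2} \sum_{\alpha \in A} \sum_{\beta \in B}
      \left| S_{\alpha,\beta} - R_{\alpha,\beta} \right|
```

Since R and S each sum to 1, the double sum is at most 2, which is why the score stays between 0 and 1.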
Usage
contingency_similarity(real, synthetic, meta)
Value
A list with two elements: pairs (a tibble of column_1, column_2, score)
and score (the mean over pairs). score is NA_real_ when there are fewer
than two categorical columns: with no pair there is no dependence to
measure, so propagating NA (rather than 1) avoids overstating fidelity
in the aggregated quality report.
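The return shape described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the function name sketch_contingency_similarity and its cat_cols argument (a character vector of categorical column names, standing in for whatever meta provides) are hypothetical, and a plain data.frame is used in place of a tibble to avoid extra dependencies.

```r
# Illustrative sketch only; `sketch_contingency_similarity` and `cat_cols`
# are hypothetical names, not part of the documented API.
sketch_contingency_similarity <- function(real, synthetic, cat_cols) {
  # Fewer than two categorical columns: no pair exists, so the score is NA.
  if (length(cat_cols) < 2) {
    empty <- data.frame(column_1 = character(), column_2 = character(),
                        score = numeric(), stringsAsFactors = FALSE)
    return(list(pairs = empty, score = NA_real_))
  }

  pair_score <- function(a, b) {
    # Shared levels keep the two contingency tables the same shape.
    lev_a <- union(as.character(real[[a]]), as.character(synthetic[[a]]))
    lev_b <- union(as.character(real[[b]]), as.character(synthetic[[b]]))
    joint <- function(df) {
      tab <- table(factor(as.character(df[[a]]), levels = lev_a),
                   factor(as.character(df[[b]]), levels = lev_b))
      tab / sum(tab)  # normalize counts to a joint distribution
    }
    # 1 - total variation distance between the two joint distributions.
    1 - 0.5 * sum(abs(joint(real) - joint(synthetic)))
  }

  combos <- combn(cat_cols, 2)  # one matrix column per unordered pair
  pairs <- data.frame(
    column_1 = combos[1, ],
    column_2 = combos[2, ],
    score = mapply(pair_score, combos[1, ], combos[2, ], USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
  list(pairs = pairs, score = mean(pairs$score))
}
```

Returning NA_real_ in the single-column case means any downstream mean() without na.rm = TRUE also becomes NA, which makes the missing measurement visible instead of silently counting it as a perfect score.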