diagnostic_report: Generate a diagnostic (validity) report for synthetic data
Description
Checks whether synthetic data is structurally valid against the real data
and metadata, independent of how closely it matches the real distributions
(that is the job of quality_report()). Mirrors the two-property hierarchy of
the SDMetrics DiagnosticReport: Data Validity and Data Structure.
Usage
diagnostic_report(real, synthetic, metadata)
Value
An rsdv_diagnostic_report object.
Arguments
real
A data frame of real data.
synthetic
A data frame of synthetic data.
metadata
An rsdv_metadata object.
Details
Data Validity (per-column checks):
  numerical: boundary adherence (fraction of values within the real min/max range),
  categorical: category adherence (fraction of values whose category was seen in the real data),
  boolean: always valid,
  primary key: key uniqueness (all values unique and non-missing).
Data Structure: fraction of expected columns present in the synthetic data.
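As a rough illustration of the Data Structure property, the fraction of expected columns present can be sketched in base R. This is not the package's implementation: structure_score() is a hypothetical helper, and it assumes the expected column names are available as a character vector (in practice they would presumably come from the metadata object).

```r
# Hypothetical helper, not part of the package API.
# expected_cols: character vector of column names the synthetic data should contain
# (assumed here to be supplied directly rather than read from metadata).
structure_score <- function(expected_cols, synthetic) {
  if (length(expected_cols) == 0L) {
    return(NA_real_)  # nothing to check against
  }
  # Fraction of expected columns that appear in the synthetic data frame
  mean(expected_cols %in% names(synthetic))
}

synthetic <- data.frame(id = 1:3, age = c(31, 45, 27))

# "city" is expected but missing, so 2 of 3 expected columns are present
score <- structure_score(c("id", "age", "city"), synthetic)
print(score)
```

A score of 1 would mean every expected column is present; extra columns in the synthetic data do not affect this particular sketch.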
Missing (NA) values are excluded from the boundary and category adherence
denominators, since missingness is modeled separately.
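The per-column checks and the NA handling described above can be sketched in base R as follows. These are illustrative helpers under stated assumptions, not the package's own code: the function names boundary_adherence(), category_adherence(), and key_uniqueness() are hypothetical, and returning a single logical for key uniqueness is an assumption about how that check is scored.

```r
# Hypothetical helpers, not part of the package API.

# Fraction of non-missing synthetic values inside the real [min, max] range
boundary_adherence <- function(real, synthetic) {
  lo <- min(real, na.rm = TRUE)
  hi <- max(real, na.rm = TRUE)
  x <- synthetic[!is.na(synthetic)]  # NAs are dropped from the denominator
  if (length(x) == 0L) {
    return(NA_real_)
  }
  mean(x >= lo & x <= hi)
}

# Fraction of non-missing synthetic values whose category occurs in the real data
category_adherence <- function(real, synthetic) {
  x <- synthetic[!is.na(synthetic)]  # NAs are dropped from the denominator
  if (length(x) == 0L) {
    return(NA_real_)
  }
  mean(x %in% unique(real[!is.na(real)]))
}

# TRUE only if the key has no missing values and no duplicates
# (assumption: scored as a single logical rather than a fraction)
key_uniqueness <- function(key) {
  !anyNA(key) && anyDuplicated(key) == 0L
}

# 20 and 40 lie within [18, 65], 70 does not, NA is ignored
b <- boundary_adherence(c(18, 30, 65), c(20, 70, NA, 40))

# "a" and "b" were seen in the real data, "c" was not, NA is ignored
k <- category_adherence(c("a", "b"), c("a", "c", NA, "b"))

print(c(boundary = b, category = k))
print(key_uniqueness(c(1, 2, 3)))
```

Dropping NA values before computing the mean keeps a column with many missing entries from being penalized twice, which matches the stated rationale that missingness is modeled separately.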