rsdv (version 0.2.0)

diagnostic_report: Generate a diagnostic (validity) report for synthetic data

Description

Checks whether synthetic data is structurally valid against the real data and its metadata, independent of how closely it matches the real distributions (that is the job of quality_report()). Mirrors the two-property hierarchy of the SDMetrics DiagnosticReport: Data Validity and Data Structure (see Details).

Usage

diagnostic_report(real, synthetic, metadata)

Value

An rsdv_diagnostic_report object.

Arguments

real

A data frame of real data.

synthetic

A data frame of synthetic data.

metadata

An rsdv_metadata object.

Details

  • Data Validity — per-column checks:

    • numerical: boundary adherence (fraction of values within the real min/max range),

    • categorical: category adherence (fraction of values whose category was seen in the real data),

    • boolean: always valid,

    • primary key: key uniqueness (all values unique and non-missing).

  • Data Structure — fraction of expected columns present in the synthetic data.
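
As a rough illustration of the Data Structure property, the fraction of expected columns present can be computed in base R. This is a minimal sketch, not the rsdv implementation: structure_score() and the expected_cols vector are hypothetical names introduced here, and how rsdv derives the expected columns from an rsdv_metadata object is not shown.

```r
# Hypothetical helper, for illustration only (not part of rsdv).
# expected_cols: character vector of column names the metadata describes.
# synthetic:     data frame of synthetic data.
structure_score <- function(expected_cols, synthetic) {
  # Fraction of expected columns that appear in the synthetic data.
  mean(expected_cols %in% names(synthetic))
}

expected <- c("age", "job", "income")
synth_df <- data.frame(age = c(30, 41), job = c("clerk", "nurse"))

# "income" is missing, so two of the three expected columns are present.
score <- structure_score(expected, synth_df)
print(score)
```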

Missing (NA) values are excluded from adherence denominators, since missingness is modeled separately.
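
The per-column Data Validity checks and the NA rule above can be sketched in base R as follows. These helpers are hypothetical and only approximate the definitions given here; they are not the functions rsdv uses internally, and the real report may aggregate or score the checks differently.

```r
# Hypothetical helpers, for illustration only (not part of rsdv).

# Numerical: fraction of non-missing synthetic values inside the real min/max.
boundary_adherence <- function(real, synthetic) {
  syn <- synthetic[!is.na(synthetic)]   # NAs are dropped from the denominator
  if (length(syn) == 0) return(NA_real_)
  rng <- range(real, na.rm = TRUE)
  mean(syn >= rng[1] & syn <= rng[2])
}

# Categorical: fraction of non-missing synthetic values seen in the real data.
category_adherence <- function(real, synthetic) {
  syn <- synthetic[!is.na(synthetic)]   # NAs are dropped from the denominator
  if (length(syn) == 0) return(NA_real_)
  mean(syn %in% unique(real[!is.na(real)]))
}

# Primary key: TRUE only if every value is non-missing and unique.
key_uniqueness <- function(key) {
  !anyNA(key) && anyDuplicated(key) == 0
}

real_age <- c(18, 25, 40, 65, NA)
syn_age  <- c(20, 70, 33, NA, 17)          # 70 and 17 fall outside [18, 65]
real_job <- c("clerk", "nurse", "clerk")
syn_job  <- c("nurse", "pilot", NA, "clerk")  # "pilot" was never observed

print(boundary_adherence(real_age, syn_age))
print(category_adherence(real_job, syn_job))
print(key_uniqueness(c(1, 2, 3)))
```

Dropping NA values before taking the mean keeps a column with many missing entries from being penalized twice, once for missingness and once for adherence.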

Examples

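# Derive metadata from the real data, fit a synthesizer, and draw 500 synthetic rows.
meta  <- metadata(adult_income)
syn   <- gaussian_copula_synthesizer(meta) |> fit(adult_income)
synth <- sample(syn, n = 500)

# Check the synthetic rows for structural validity against the real data.
diagnostic_report(adult_income, synth, meta)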