Generates a synthetic copy of data, then optionally detects and handles
sensitive columns by name. Detection runs against the ORIGINAL column names;
matches are mapped to the output columns via attr(fake, "name_map"), if that
attribute is present.
generate_fake_with_privacy(
  data,
  n = 30,
  level = c("low", "medium", "high"),
  seed = NULL,
  sensitive = NULL,
  sensitive_detect = TRUE,
  sensitive_strategy = c("fake", "drop"),
  normalize = TRUE,
  sensitive_patterns = NULL,
  sensitive_regex = NULL
)
Returns a data.frame with the attributes sensitive_columns, dropped_columns, and name_map.
data: A data.frame (or an object coercible to one) to mirror.
n: Number of rows to generate. The usage default is 30; if NULL, the number of rows in the input is used.
level: One of "low", "medium", or "high".
seed: Optional RNG seed.
sensitive: Character vector of original column names to treat as sensitive.
sensitive_detect: Logical; auto-detect common sensitive columns by name.
sensitive_strategy: One of "fake" or "drop".
normalize: Logical; lightly normalize inputs.
sensitive_patterns: Optional named list of patterns to treat as sensitive (e.g., list(id = "...", email = "...", phone = "...")). Overrides the defaults.
sensitive_regex: Optional fully combined regex (a single string) used to detect sensitive columns by name. If supplied, it is used instead of the defaults.
Generate fake data with privacy controls
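The name-based detection described above can be sketched in base R. This is an illustration only, not the function's actual implementation: the patterns, the column names, and the shape of name_map (a named character vector from original to output names) are all assumptions made for the example.

```r
# Hypothetical per-category patterns; these are NOT the built-in defaults.
patterns <- list(
  id    = "(^|_)id$",
  email = "e[-_]?mail",
  phone = "phone|mobile"
)

# Collapse the named list into a single alternation, similar in spirit
# to what a caller might pass as sensitive_regex.
combined <- paste(unlist(patterns), collapse = "|")

# Detection is applied to the ORIGINAL column names, case-insensitively.
original_names <- c("customer_id", "Email Address", "age", "phone")
hits <- original_names[grepl(combined, original_names, ignore.case = TRUE)]

# Assumed shape of name_map: original name -> output (normalized) name.
name_map <- c(
  "customer_id"   = "customer_id",
  "Email Address" = "email_address",
  "age"           = "age",
  "phone"         = "phone"
)

# Translate the detected original names to their output names.
output_hits <- unname(name_map[hits])

# A "drop"-style strategy then removes those columns from the output.
fake <- data.frame(
  customer_id   = 1:3,
  email_address = c("a@x.org", "b@x.org", "c@x.org"),
  age           = c(31, 45, 27),
  phone         = c("555-0100", "555-0101", "555-0102")
)
fake_dropped <- fake[, setdiff(names(fake), output_hits), drop = FALSE]
```

Matching on original names before mapping means that a name such as "Email Address" is still caught even when the output column has been renamed.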