Generates fake data, writes files (CSV/RDS/Parquet), writes a scrubbed JSON schema, and optionally writes a README prompt and a single ZIP file containing everything.
llm_bundle(
  data,
  n = 30,
  level = c("medium", "low", "high"),
  formats = c("csv", "rds"),
  path = tempdir(),
  filename = "fake_bundle",
  seed = NULL,
  write_prompt = TRUE,
  zip = FALSE,
  prompt_filename = "README_FOR_LLM.txt",
  zip_filename = NULL,
  sensitive = NULL,
  sensitive_detect = TRUE,
  sensitive_strategy = c("fake", "drop"),
  normalize = FALSE
)
Returns a list with the output paths $data_paths (named), $schema_path, $readme_path (optional), and $zip_path (optional), plus $fake (the fake data.frame).
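As a rough illustration of the call and its return value, the sketch below builds a small toy data frame and inspects the returned list. The data frame and its column names are made up for this example, and it assumes the package that exports llm_bundle() is already attached; the call is guarded so the script still runs when it is not.

```r
# Toy input; the column names here are hypothetical, for illustration only.
df <- data.frame(
  id    = 1:5,
  email = paste0("user", 1:5, "@example.com"),
  score = c(3.2, 4.8, 2.1, 5.0, 3.9),
  stringsAsFactors = FALSE
)

# Guarded so the script still runs when llm_bundle() is not available.
if (exists("llm_bundle", mode = "function")) {
  out <- llm_bundle(df, n = 10, seed = 42, filename = "demo_bundle")

  print(out$data_paths)   # named paths of the written data files
  print(out$schema_path)  # path of the scrubbed JSON schema
  print(out$readme_path)  # optional; write_prompt defaults to TRUE
  print(head(out$fake))   # the fake data.frame
}
```

With the defaults for path and formats, the files should land in tempdir() as CSV and RDS.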
data: A data.frame (or an object coercible to one) to mirror.
n: Number of rows in the fake dataset. Default 30.
level: Privacy level: "low", "medium", or "high". Controls how strict the defaults are.
formats: Which data files to write: any of "csv", "rds", "parquet".
path: Folder to write outputs to. Default: tempdir().
filename: Base file name (no extension), e.g. "demo_bundle". Output files are named after it: "demo_bundle.csv", "demo_bundle.rds", etc.
seed: Optional RNG seed for reproducibility.
write_prompt: Logical; write a README_FOR_LLM.txt next to the data? Default TRUE.
zip: Logical; create a single ZIP archive containing data + schema + README? Default FALSE.
prompt_filename: Name for the README file. Default "README_FOR_LLM.txt".
zip_filename: Optional custom name for the ZIP file (no path). If NULL (default), it is derived as paste0(filename, ".zip"), e.g. "demo_bundle.zip".
sensitive: Optional character vector of column names to treat as sensitive.
sensitive_detect: Logical; auto-detect common sensitive columns (id/email/phone)? Default TRUE.
sensitive_strategy: "fake" (replace with realistic fakes) or "drop". Default "fake".
normalize: Logical; if TRUE, attempt light auto-normalization before faking.
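The naming rules described for filename, formats, and zip_filename can be sketched in base R. This only mirrors the documented naming (base name plus extension, and paste0(filename, ".zip") for the archive); it is not taken from the function's internals, and the variable names data_files and zip_name are made up for the example.

```r
filename     <- "demo_bundle"
formats      <- c("csv", "rds")
zip_filename <- NULL

# One data file per requested format, e.g. "demo_bundle.csv".
data_files <- paste0(filename, ".", formats)

# Default ZIP name when zip_filename is NULL, as documented.
zip_name <- if (is.null(zip_filename)) paste0(filename, ".zip") else zip_filename

# Full paths under the default output folder.
print(file.path(tempdir(), c(data_files, zip_name)))
```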
Tips
Avoid angle brackets in examples; prefer plain tokens like NAME or FILE_NAME. If you truly want bracket glyphs, use the Unicode form ⟨name⟩.