Generate Synthetic Data using Bootstrap with Perturbation
generate_bootstrap_synthetic(
  data,
  continuous_vars,
  cat_vars,
  n = NULL,
  seed = 123,
  noise_level = 0.1,
  id_var = NULL,
  cat_flip_prob = NULL,
  preserve_bounds = TRUE,
  ordinal_vars = NULL
)

Returns a data frame with synthetic data.
data: Original dataset to bootstrap from
continuous_vars: Character vector of continuous variable names
cat_vars: Character vector of categorical variable names
n: Number of synthetic observations to generate (default: same as original)
seed: Random seed for reproducibility (default: 123)
noise_level: Noise level for perturbation (0 to 1, default: 0.1)
id_var: Optional name of ID variable to regenerate (will be numbered 1:n)
cat_flip_prob: Probability of flipping categorical values (default: noise_level / 2)
preserve_bounds: Logical: should continuous variables stay within original bounds? (default: TRUE)
ordinal_vars: Optional character vector of ordinal categorical variables (these will be perturbed to adjacent values rather than randomly flipped)
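
The following is a minimal base R sketch of the general bootstrap-with-perturbation idea that the arguments above describe. It is not the implementation of generate_bootstrap_synthetic. The function name bootstrap_perturb_sketch, the Gaussian noise model (standard deviation equal to noise_level times the column's standard deviation), and the uniform re-draw used for flipped categories are all assumptions made for illustration. The id_var and ordinal_vars behaviour is omitted.

```r
# Hypothetical sketch only; the real function may work differently.
bootstrap_perturb_sketch <- function(data, continuous_vars, cat_vars,
                                     n = NULL, seed = 123, noise_level = 0.1,
                                     cat_flip_prob = NULL,
                                     preserve_bounds = TRUE) {
  set.seed(seed)
  if (is.null(n)) n <- nrow(data)
  if (is.null(cat_flip_prob)) cat_flip_prob <- noise_level / 2

  # Step 1: bootstrap, i.e. resample rows with replacement.
  idx <- sample.int(nrow(data), size = n, replace = TRUE)
  synth <- data[idx, , drop = FALSE]
  rownames(synth) <- NULL

  # Step 2: perturb continuous variables.
  # Assumption: additive Gaussian noise with SD = noise_level * SD(original).
  for (v in continuous_vars) {
    x <- data[[v]]
    noise_sd <- noise_level * sd(x, na.rm = TRUE)
    synth[[v]] <- synth[[v]] + rnorm(n, mean = 0, sd = noise_sd)
    if (preserve_bounds) {
      # Clamp to the observed range of the original column.
      lo <- min(x, na.rm = TRUE)
      hi <- max(x, na.rm = TRUE)
      synth[[v]] <- pmin(pmax(synth[[v]], lo), hi)
    }
  }

  # Step 3: flip categorical values with probability cat_flip_prob.
  # Assumption: a flipped value is re-drawn uniformly from the observed
  # categories, so it may land on the same category again.
  for (v in cat_vars) {
    observed <- unique(data[[v]][!is.na(data[[v]])])
    flip <- runif(n) < cat_flip_prob
    if (any(flip)) {
      pick <- sample.int(length(observed), sum(flip), replace = TRUE)
      synth[[v]][flip] <- observed[pick]
    }
  }

  synth
}

# Toy data invented for this example.
toy <- data.frame(
  age    = c(23, 35, 47, 52, 61, 29, 44, 38),
  income = c(31, 48, 52, 75, 66, 40, 58, 45),
  group  = c("a", "b", "a", "c", "b", "a", "c", "b"),
  stringsAsFactors = FALSE
)

synth <- bootstrap_perturb_sketch(toy, c("age", "income"), "group", n = 20)
```

Because the seed is set inside the function, repeated calls with the same inputs return the same synthetic data frame. With preserve_bounds = TRUE, perturbed values are clamped to the original range, which can pile up mass at the minimum and maximum when noise_level is large.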