vazul (version 1.1.0)

mask_variables: Mask categorical variables with random labels in a data frame

Description

Applies masked labels to multiple categorical variables in a data frame using the mask_labels() function. By default, each variable gets its own independent random masked labels; optionally, all selected variables can share the same set of masked labels.

Usage

mask_variables(data, ..., .across_variables = FALSE)

Value

A data frame with the specified categorical columns masked. Only character and factor columns can be processed.

Arguments

data

a data frame

...

Columns to mask using tidyselect semantics. Each can be:

  • Bare column names (e.g., var1, var2)

  • A tidyselect expression (e.g., starts_with("treat_"))

  • A character vector of column names (e.g., c("var1", "var2"))

  • Several of the above, passed as separate arguments

Only character and factor columns will be processed.

.across_variables

logical. If TRUE, all selected variables will use the same set of masked labels. If FALSE (default), each variable gets its own independent set of masked labels using the column name as prefix.
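
The difference between the two modes can be sketched in base R. This is only an illustration of the idea, not the package's implementation: mask_sketch() is a hypothetical helper, the label format ("<prefix>01", "<prefix>02", ...) is an assumption, and pooling the values of all columns into one mapping is one plausible reading of "the same set of masked labels".

```r
# Hypothetical helper, not part of vazul: map each unique value of x to a
# randomly assigned label such as "<prefix>01", "<prefix>02", ...
mask_sketch <- function(x, prefix) {
  values <- sort(unique(as.character(x)))
  labels <- sprintf("%s%02d", prefix, sample(seq_along(values)))
  unname(setNames(labels, values)[as.character(x)])
}

df <- data.frame(
  treatment = c("control", "intervention", "control"),
  outcome = c("success", "failure", "success")
)

set.seed(1)

# Independent masking: each column gets its own mapping and its own prefix
independent <- df
for (col in names(df)) {
  independent[[col]] <- mask_sketch(df[[col]], prefix = paste0(col, "_group_"))
}

# Shared masking (assumed behaviour): one mapping built from the pooled
# values of all selected columns, applied to every column
pooled <- sort(unique(unlist(lapply(df, as.character))))
shared_map <- setNames(
  sprintf("masked_group_%02d", sample(seq_along(pooled))),
  pooled
)
shared <- df
shared[] <- lapply(df, function(x) unname(shared_map[as.character(x)]))

print(independent)
print(shared)
```

In the independent result, labels in one column say nothing about labels in another. In the shared result, a value that appears in several columns would receive the same masked label everywhere.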

See Also

mask_labels for masking a single vector, mask_names for masking variable names.

Examples

# Create example data
df <- data.frame(
  treatment = c("control", "intervention", "control"),
  outcome = c("success", "failure", "success"),
  score = c(1, 2, 3)  # numeric, won't be masked
)

set.seed(123)
# Independent masking for each variable (default - uses column names as
# prefixes)
# Using bare names
mask_variables(df, treatment, outcome)
# Or using character vector
mask_variables(df, c("treatment", "outcome"))

set.seed(456)
# Shared masking across variables
mask_variables(df, c("treatment", "outcome"), .across_variables = TRUE)

# Using tidyselect helpers
mask_variables(df, where(is.character))

# Example with multiple categorical columns
df2 <- data.frame(
  group = c("A", "B", "A", "B"),
  condition = c("ctrl", "test", "ctrl", "test")
)
set.seed(123)
result <- mask_variables(df2, c("group", "condition"))
print(result)

# Example with williams dataset (multiple categorical columns)
data(williams)
set.seed(456)
# Using bare names (recommended for interactive use)
williams_masked <- mask_variables(williams, subject, ecology)
head(williams_masked[c("subject", "ecology")])