vazul (version 1.1.0)

scramble_variables: Scrambling the content of several variables in a data frame

Description

Scramble the values of several selected variables in a data frame simultaneously. Supports independent scrambling, joint scrambling, and within-group scrambling.

Usage

scramble_variables(
  data,
  ...,
  .groups = NULL,
  .together = FALSE,
  .byrow = FALSE
)

Value

A data frame with the specified columns scrambled. If grouping is specified, scrambling is done within each group.

Arguments

data

A data frame.

...

Columns to scramble using tidyselect semantics. Each can be:

  • Bare column names (e.g., var1, var2)

  • A tidyselect expression (e.g., starts_with("treat_"))

  • A character vector of column names (e.g., c("var1", "var2"))

  • Multiple sets can be provided as separate arguments

.groups

Optional grouping columns. Scrambling will be done within each group. Supports the same tidyselect syntax as column selection. Grouping columns must not overlap with the columns selected in .... If data is already a grouped dplyr data frame, existing grouping is ignored unless .groups is explicitly provided. Ignored if .byrow = TRUE.
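The idea of within-group scrambling can be illustrated in base R. This is a minimal sketch of the concept only, not the package's implementation; the data frame `toy` and the helper `shuffle` are hypothetical names introduced for the example.

```r
# Illustrative base R sketch of within-group scrambling.
# Not the vazul implementation; `toy` and `shuffle` are made-up names.
set.seed(1)
toy <- data.frame(
  y = letters[1:6],
  group = rep(c("A", "B"), each = 3)
)

# Permute by index so that a group of length 1 is handled safely.
shuffle <- function(v) v[sample.int(length(v))]

scrambled <- toy
# ave() applies `shuffle` separately to the values of each group,
# so values of y never move from one group to another.
scrambled$y <- ave(toy$y, toy$group, FUN = shuffle)
scrambled
```

Each group keeps exactly the same set of values it started with; only their order within the group changes.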

.together

Logical. If TRUE, the selected variables are scrambled together as a unit: the combination of values found in each row is kept intact, but whole combinations are reassigned across rows. If FALSE (default), each variable is scrambled independently.
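The difference between independent and joint scrambling can be sketched in base R. This is an assumption-laden illustration of the two ideas, not the package's own code; `toy`, `independent`, `joint`, and `perm` are hypothetical names.

```r
# Illustrative base R sketch; not the vazul implementation.
set.seed(1)
toy <- data.frame(x = 1:6, y = letters[1:6])

# Independent scrambling: each column gets its own permutation,
# so the original (x, y) pairings are generally broken.
independent <- toy
independent$x <- sample(toy$x)
independent$y <- sample(toy$y)

# Joint scrambling: one permutation of the row indices is applied to
# both columns, so every (x, y) pair survives but may change rows.
perm <- sample(nrow(toy))
joint <- toy
joint[c("x", "y")] <- toy[perm, c("x", "y")]

independent
joint
```

In `joint`, the value 1 still sits next to "a", 2 next to "b", and so on; in `independent` that link is usually lost.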

.byrow

Logical. If TRUE, values are scrambled rowwise: within each row, the values of the selected columns are shuffled among themselves. This requires the selected columns to have compatible types. Cannot be combined with .together = TRUE.
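Rowwise scrambling can be pictured with a short base R sketch. It is a conceptual illustration under the assumption that all selected columns share one type, not a description of how the package does it; `toy`, `cols`, `m`, and `shuffled` are hypothetical names.

```r
# Illustrative base R sketch of rowwise scrambling.
# Not the vazul implementation.
set.seed(1)
toy <- data.frame(a = 1:3, b = 4:6, c = 7:9)
cols <- c("a", "b", "c")

# as.matrix() forces one common type for all selected columns, which
# shows why the columns need compatible types for this operation.
m <- as.matrix(toy[cols])
for (i in seq_len(nrow(m))) {
  # Shuffle the values of row i among the selected columns.
  m[i, ] <- m[i, sample.int(ncol(m))]
}

shuffled <- toy
shuffled[cols] <- as.data.frame(m)
shuffled
```

Every row of `shuffled` holds the same values as the corresponding row of `toy`, only in a possibly different column order.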

See Also

scramble_values for scrambling a single vector.

Examples

df <- data.frame(
  x = 1:6,
  y = letters[1:6],
  group = c("A", "A", "A", "B", "B", "B")
)

set.seed(123)
# Example without grouping. Variables scrambled across the entire data frame.
# Using bare names
df |> scramble_variables(x, y)
# Or using character vector
df |> scramble_variables(c("x", "y"))

# Example with .together = TRUE. Variables scrambled together as a unit per row.
df |> scramble_variables(c("x", "y"), .together = TRUE)

# Example with grouping. The variable is scrambled only within each group.
df |> scramble_variables("y", .groups = "group")

# Example combining grouping and together parameters
df |> scramble_variables(c("x", "y"), .groups = "group", .together = TRUE)

# Example with tidyselect helpers
library(dplyr)
df |> scramble_variables(starts_with("x"))
df |> scramble_variables(where(is.numeric), .groups = "group")

# Example with the 'williams' dataset
data(williams)
williams |> scramble_variables(c("ecology", "age"))
williams |> scramble_variables(1:5)
williams |> scramble_variables(c("ecology", "age"), .groups = "gender")
williams |> scramble_variables(c(1, 2), .groups = 3)
williams |> scramble_variables(c("ecology", "age"), .together = TRUE)
williams |> scramble_variables(c("ecology", "age"), .groups = "gender", .together = TRUE)

# Rowwise scrambling
df_row <- data.frame(a = 1:3, b = 4:6, c = 7:9)
df_row |> scramble_variables(a, b, c, .byrow = TRUE)