SVEMnet (version 3.2.0)

bigexp_prepare: Prepare data to match a bigexp_spec

Description

bigexp_prepare() coerces a new data frame so that it matches a spec previously built by bigexp_terms(). It:

  • applies the locked factor levels for categorical predictors,

  • requires continuous variables to be numeric (and errors if they are not), and

  • optionally warns about or errors on unseen factor levels.

Usage

bigexp_prepare(spec, data, unseen = c("warn_na", "error"))

Value

A list with two elements:

  • formula: the expanded formula stored in the spec (same as spec$formula).

  • data: a copy of the input data with predictor columns coerced to match the spec (types and levels), suitable for model.frame() / model.matrix().

Arguments

spec

Object returned by bigexp_terms.

data

Data frame to prepare (for example, training, test, or future batches).

unseen

How to handle unseen factor levels in data: "warn_na" (default) maps unseen levels to NA and issues a warning, or "error" stops with an error if any unseen levels are encountered.
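
The two policies can be illustrated in base R. This is only a sketch of the behavior described above, not SVEMnet's implementation: apply_locked_levels() is a hypothetical helper written for this illustration, and its messages are assumptions.

```r
# Hypothetical helper (not part of SVEMnet): re-code a vector against a
# fixed set of levels and apply one of the two "unseen" policies.
apply_locked_levels <- function(x, locked, unseen = c("warn_na", "error")) {
  unseen <- match.arg(unseen)
  x <- as.character(x)
  new_levels <- setdiff(unique(x[!is.na(x)]), locked)
  if (length(new_levels) > 0) {
    msg <- paste("Unseen factor levels:", paste(new_levels, collapse = ", "))
    if (unseen == "error") stop(msg, call. = FALSE)
    warning(msg, "; mapping them to NA", call. = FALSE)
  }
  # factor() turns any value that is not in `levels` into NA
  factor(x, levels = locked)
}

g_new <- c("A", "C", "B")  # "C" was not present when the levels were locked

# "warn_na"-style handling: the unseen value becomes NA (warning silenced here)
g_na <- suppressWarnings(apply_locked_levels(g_new, locked = c("A", "B")))

# "error"-style handling: capture the error message instead of stopping
g_err <- tryCatch(
  apply_locked_levels(g_new, locked = c("A", "B"), unseen = "error"),
  error = function(e) conditionMessage(e)
)
```

With the "warn_na"-style policy the row is kept but its factor value is NA, so downstream modeling code may drop or reject it; the "error"-style policy fails early instead.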

Details

Columns that are not listed in spec$vars (for example, the response or extra metadata columns) are left unchanged.

The goal is that model.matrix(spec$formula, data) will produce the same set of columns in the same order across all datasets prepared with the same spec, even if some levels are missing in a particular batch.
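
Why locked levels matter can be seen with base R alone. The sketch below does not call SVEMnet; the data and variable names are made up for illustration. It compares a batch whose factor levels are re-derived from the batch itself with the same batch coded against the training levels.

```r
# Levels observed in (made-up) training data
train_g <- factor(c("A", "B", "C", "A"))
locked  <- levels(train_g)

# A new batch in which level "C" happens not to occur
batch <- data.frame(X1 = c(0.5, -1.2, 0.3), G = c("A", "A", "B"))

# Naive coding: levels are inferred from the batch, so "C" disappears
naive   <- batch
naive$G <- factor(naive$G)
cols_naive <- colnames(model.matrix(~ X1 + G, naive))

# Locked coding: reuse the training levels, so the column for "C" is kept
fixed   <- batch
fixed$G <- factor(fixed$G, levels = locked)
cols_fixed <- colnames(model.matrix(~ X1 + G, fixed))

# Columns produced from the training data itself, for comparison
train_df   <- data.frame(X1 = c(1, 2, 3, 4), G = train_g)
cols_train <- colnames(model.matrix(~ X1 + G, train_df))

print(cols_naive)
print(cols_fixed)
```

The locked version yields the same column names, in the same order, as the training design matrix; the column for the missing level is simply all zeros in this batch.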

See Also

bigexp_terms

Examples

library(SVEMnet)

set.seed(1)
train <- data.frame(
  y  = rnorm(10),
  X1 = rnorm(10),
  X2 = rnorm(10),
  G  = factor(sample(c("A", "B"), 10, replace = TRUE))
)

spec <- bigexp_terms(
  y ~ X1 + X2 + G,
  data             = train,
  factorial_order  = 2,
  polynomial_order = 2
)

newdata <- data.frame(
  y  = rnorm(5),
  X1 = rnorm(5),
  X2 = rnorm(5),
  G  = factor(sample(c("A", "B"), 5, replace = TRUE))
)

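# Coerce newdata to the types and factor levels locked in spec
prep <- bigexp_prepare(spec, newdata)
str(prep$data)

# The prepared data can be passed to model.matrix() with the stored formula
X_new <- model.matrix(prep$formula, prep$data)
dim(X_new)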