Preprocess & create a model matrix with interactions + polynomials
sparseR_prep(
  formula,
  data,
  k = 1,
  poly = 1,
  pre_proc_opts = c("knnImpute", "scale", "center", "otherbin", "none"),
  ia_formula = NULL,
  filter = c("nzv", "zv"),
  extra_opts = list(),
  family = "gaussian"
)
An object of class recipe; see recipes::recipe()
formula: A formula of the main effects + outcome of the model
data: A required data frame or tibble containing the variables in formula
k: Maximum order of interactions to numeric variables
poly: The maximum order of polynomials to consider
pre_proc_opts: A character vector specifying methods for preprocessing (see details)
ia_formula: Formula to be passed to step_interact (for interactions, see details)
filter: Which methods should be used to filter out variables with (near) zero variance? (see details)
extra_opts: Extra options to be used for preprocessing
family: Family passed from sparseR
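As a rough illustration of how these arguments fit together, the sketch below calls sparseR_prep on the built-in iris data. The choice of data set, outcome, and argument values is an assumption made for illustration only; the call also assumes the sparseR package is installed.

```r
# Assumes the sparseR package (and its dependency, recipes) is installed.
library(sparseR)

# iris is used purely as illustrative data; it is not part of the original text.
rec <- sparseR_prep(
  Sepal.Width ~ .,  # outcome ~ main effects
  data = iris,
  k = 1,            # maximum order of interactions
  poly = 2          # maximum order of polynomials
)

# The documented return value is an object of class "recipe".
inherits(rec, "recipe")
```

All other arguments are left at the defaults shown in the usage above.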
The pre_proc_opts argument acts as a wrapper for the corresponding procedures in the recipes package. The currently supported options that can be passed to pre_proc_opts are:
knnImpute: Should k-nearest-neighbors imputation be performed (if necessary)?
scale: Should variables be scaled prior to creating interactions? (Does not scale factor variables or dummy variables.)
center: Should variables be centered? (Does not center factor variables or dummy variables.)
otherbin:
By default (ia_formula = NULL), all variables are interacted with each other up to order k. If specified, ia_formula is passed as the terms argument to recipes::step_interact, so the help documentation for that function can be consulted for guidance on specifying particular interactions.
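To give a sense of what a terms formula for recipes::step_interact looks like, here is a minimal sketch that uses the recipes package directly rather than sparseR_prep. The iris data and the particular pair of variables are assumptions chosen for illustration; how variable names appear after sparseR_prep's own preprocessing is not shown here.

```r
# Assumes the recipes package is installed.
library(recipes)

# Start a recipe with Sepal.Width as the outcome (illustrative choice).
rec <- recipe(Sepal.Width ~ ., data = iris)

# A one-sided formula names the specific interaction to create.
rec <- step_interact(rec, terms = ~ Sepal.Length:Petal.Length)

# Estimate the recipe on the training data and return the processed data.
rec <- prep(rec, training = iris)
baked <- bake(rec, new_data = NULL)

# Interaction columns use the default "_x_" separator in their names.
grep("_x_", names(baked), value = TRUE)
```

Since ia_formula is passed on as the terms argument, a one-sided formula of this shape is presumably what it expects.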
The methods specified in filter are important: filtering is necessary to cut down on extraneous polynomials and interactions in cases where they do not make sense. This is true, for instance, when using dummy variables in polynomials, or when using interactions of dummy variables that relate to the same categorical variable.
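The two cases mentioned above can be checked with a small base R sketch. The toy factor below is hypothetical, and the code does not call sparseR_prep; it only shows why such terms carry no extra information and why a zero-variance filter would be expected to catch the second kind.

```r
# Hypothetical toy factor with three levels.
f <- factor(c("a", "b", "c", "a", "b", "c"))

# Dummy (indicator) columns for levels "b" and "c"; "a" is the reference level.
d <- model.matrix(~ f)[, -1]

# Squaring a 0/1 dummy reproduces the dummy itself,
# so a polynomial term of a dummy variable is redundant.
all(d[, "fb"]^2 == d[, "fb"])

# Two dummies from the same factor are never both 1,
# so their interaction is zero everywhere and has zero variance.
ia <- d[, "fb"] * d[, "fc"]
var(ia) == 0
```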