Please see vignette("MultiClassVtreat", package = "vtreat") or
https://winvector.github.io/vtreat/articles/MultiClassVtreat.html.
mkCrossFrameMExperiment(d, vars, y_name, ..., weights = c(),
minFraction = 0.02, smFactor = 0, rareCount = 0, rareSig = 1,
collarProb = 0, codeRestriction = NULL, customCoders = NULL,
scale = FALSE, doCollar = FALSE, splitFunction = NULL,
ncross = 3, forceSplit = FALSE, catScaling = FALSE,
y_dependent_treatments = c("catB"), verbose = FALSE,
parallelCluster = NULL, use_parallel = TRUE)
d: data to learn from.
vars: character, vector of independent variable column names.
y_name: character, name of outcome column.
...: not used, declared to force named binding of later arguments.
weights: optional training weights for each row.
minFraction: optional minimum frequency a categorical level must have to be converted to an indicator column.
smFactor: optional smoothing factor for impact coding models.
rareCount: optional integer, allow levels with this count or below to be pooled into a shared rare-level. Defaults to 0 or off.
rareSig: optional numeric, suppress levels from pooling at this significance value or greater. Defaults to 1 or off.
collarProb: what fraction of the data (pseudo-probability) to collar data at if doCollar is set during prepare.multinomial_plan.
codeRestriction: what types of variables to produce (character array of level codes, NULL means no restriction).
customCoders: map from code names to custom categorical variable encoding functions (please see https://github.com/WinVector/vtreat/blob/master/extras/CustomLevelCoders.md).
scale: optional, if TRUE replace numeric variables with regression ("move to outcome-scale").
doCollar: optional, if TRUE collar numeric variables by cutting off after a tail-probability specified by collarProb during treatment design.
splitFunction: (optional) see vtreat::buildEvalSets.
ncross: optional scalar >= 2, number of cross-validation rounds to design.
forceSplit: logical, if TRUE force cross-validated significance calculations on all variables.
catScaling: optional, if TRUE use glm() linkspace, if FALSE use lm() for scaling.
y_dependent_treatments: character, what treatment types to build per outcome level.
verbose: if TRUE print progress.
parallelCluster: (optional) a cluster object created by package parallel or package snow.
use_parallel: logical, if TRUE use parallel methods.
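
Because the `...` placeholder sits right after the outcome column name, the remaining arguments are meant to be supplied by name. The following is a minimal sketch of overriding a few defaults and passing a cluster built with the parallel package. The toy data frame, its column names (`x_num`, `x_cat`, `y`), and the choice of two workers are illustrative assumptions, not part of the package documentation.

```r
library(vtreat)

set.seed(2024)
n <- 300

# Hypothetical toy data: one numeric input, one categorical input,
# and a three-level outcome (made up for illustration only).
d <- data.frame(
  x_num = rnorm(n),
  x_cat = sample(c("a", "b", "c"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
d$y <- ifelse(d$x_num > 0.5, "high",
              ifelse(d$x_cat == "a", "mid", "low"))

# Assumed setup: a small local cluster with two workers.
cl <- parallel::makeCluster(2)

# Arguments after y_name are passed by name; the cluster is always stopped.
cfe <- tryCatch(
  mkCrossFrameMExperiment(
    d, c("x_num", "x_cat"), "y",
    ncross = 5,
    verbose = FALSE,
    parallelCluster = cl
  ),
  finally = parallel::stopCluster(cl)
)
```

Leaving `parallelCluster` at its default of NULL skips the cluster setup entirely, which is usually enough for small data.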
list(cross_frame, treatments_0, treatments_m)
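
As a rough illustration of working with the returned list, the sketch below runs the function with default settings and pulls out the cross frame, which is intended as the training data for a downstream multiclass model. The toy data and its column names are hypothetical, and accessing the element by name as `$cross_frame` is assumed from the return listing above.

```r
library(vtreat)

set.seed(101)
n <- 300

# Hypothetical toy data with a three-level outcome (illustration only).
d <- data.frame(
  x_num = rnorm(n),
  x_cat = sample(c("a", "b", "c"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
d$y <- ifelse(d$x_num > 0.5, "high",
              ifelse(d$x_cat == "a", "mid", "low"))

# Design the treatments and build the cross-validated training frame.
cfe <- mkCrossFrameMExperiment(
  d,
  vars = c("x_num", "x_cat"),
  y_name = "y"
)

# Assumption: the cross frame is accessed by name from the returned list.
cross_frame <- cfe$cross_frame

# One row per input row; columns are the derived, model-ready variables.
str(cross_frame)
```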