mlr_graphs_robustify
Robustify a learner
Creates a Graph that can be used to robustify any subsequent learner.
Performs the following steps:
- Drops empty factor levels using PipeOpFixFactors
- Imputes numeric features using PipeOpImputeHist and PipeOpMissInd
- Imputes factor features using PipeOpImputeOOR
- Encodes factors using one-hot-encoding. Factors with a cardinality > max_cardinality are collapsed using PipeOpCollapseFactors.
- If scaling, numeric features are scaled to mean 0 and standard deviation 1.
The graph is built conservatively, i.e. the function always tries to ensure that everything works. If a learner is provided, some steps can be left out, e.g. if the learner can handle factor features, no encoding is performed.
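To make the listed steps concrete, the following is a minimal sketch of how a comparable graph could be assembled by hand. It is not the function's actual implementation: the `po()` keys, the order of the steps, the use of `po("copy", outnum = 2)` to feed the two numeric imputation branches, and the use of `target_level_count` for the cardinality limit are assumptions made for illustration. The scaling step is left out.

```r
library(mlr3pipelines)

# Assumed threshold, mirroring the documented default of max_cardinality
max_cardinality = 1000

# Numeric imputation branch: histogram imputation and missingness indicators
# are set up on two copies of the input and joined column-wise again
impute_numeric =
  po("copy", outnum = 2) %>>%
  gunion(list(po("imputehist"), po("missind"))) %>>%
  po("featureunion")

# Chain the steps in the order in which they are listed above
graph =
  po("fixfactors") %>>%
  impute_numeric %>>%
  po("imputeoor") %>>%
  po("collapsefactors", target_level_count = max_cardinality) %>>%
  po("encode", method = "one-hot")

# Show the ids of the PipeOps contained in the graph
print(graph$ids())
```

Building the graph by hand like this always includes every step, whereas pipeline_robustify can leave steps out when a task and learner are supplied.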
Usage
pipeline_robustify(
  task = NULL,
  learner = NULL,
  impute_missings = NULL,
  factors_to_numeric = NULL,
  max_cardinality = 1000
)
Arguments
- task
Task
A Task to create a robustifying pipeline for. Optional, if omitted, the full pipeline is created.
- learner
Learner
A learner to create a robustifying pipeline for. Optional, if omitted, a more conservative pipeline is built.
- impute_missings
logical(1) | NULL
Should missing values be imputed? Defaults to NULL, i.e. imputes if the task has missing values and the learner cannot handle them.
- factors_to_numeric
logical(1) | NULL
Should factors be encoded? Defaults to NULL, i.e. encodes if the task has factors and the learner cannot handle factors.
- max_cardinality
integer(1)
Maximum number of factor levels allowed. See above. Default: 1000.
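As a hedged illustration of how these arguments interact, the sketch below builds three graphs: one where the function decides based on task and learner, one where imputation and encoding are requested explicitly, and one without task or learner. The task `pima` (which contains missing values), the learner `classif.rpart`, and the value 50 for max_cardinality are arbitrary choices for this example. The printed ids depend on the installed package version, so no output is shown.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("pima")               # example task with missing values
learner = lrn("classif.rpart")   # example learner

# Defaults (NULL): the function inspects task and learner to decide
# whether imputation and encoding are needed
gr_auto = pipeline_robustify(task, learner)

# Explicitly request imputation and encoding, with a lower cardinality limit
gr_forced = pipeline_robustify(
  task, learner,
  impute_missings = TRUE,
  factors_to_numeric = TRUE,
  max_cardinality = 50
)

# No task and no learner: the full, most conservative pipeline
gr_full = pipeline_robustify()

# Compare which PipeOps ended up in each graph
print(gr_auto$ids())
print(gr_forced$ids())
print(gr_full$ids())
```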
Value
A Graph.
Examples
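# NOT RUN {
library(mlr3)
library(mlr3pipelines)  # provides pipeline_robustify(), po() and %>>%

learner = lrn("regr.rpart")
task = mlr_tasks$get("boston_housing")

# Prepend the robustifying steps to the learner
gr = pipeline_robustify(task, learner) %>>% po("learner", learner)

# Evaluate the resulting GraphLearner on a holdout split
resample(task, GraphLearner$new(gr), rsmp("holdout"))
# }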