mlr_graphs_robustify: Robustify a learner

Description

Creates a Graph that can be used to robustify any subsequent learner. Performs the following steps:

Drops empty factor levels using PipeOpFixFactors
Imputes numeric features using PipeOpImputeHist and PipeOpMissInd
Imputes factor features using PipeOpImputeOOR
Encodes factors using one-hot encoding. Factors with a cardinality greater than max_cardinality are collapsed using PipeOpCollapseFactors.
If scaling is applied, numeric features are scaled to mean 0 and standard deviation 1.

The graph is built conservatively, i.e., the function always tries to ensure that everything works. If a learner is provided, some steps can be left out; e.g., if the learner can deal with factor variables, no encoding is performed.
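
As a rough illustration of this conservative behaviour, the sketch below builds the graph once without any arguments and once for a specific task and learner, then lists the PipeOps in each. The penguins task and the classif.rpart learner are assumptions chosen for illustration; which steps remain may differ between mlr3pipelines versions.

```r
library(mlr3)
library(mlr3pipelines)

# No task or learner given: the full, conservative pipeline is created
gr_full = pipeline_robustify()

# Illustrative choice (assumption): penguins contains factors and missing
# values, and rpart reports that it can handle both, so steps may be left out
task = tsk("penguins")
learner = lrn("classif.rpart")
gr_small = pipeline_robustify(task, learner)

# Compare which PipeOps each graph contains
print(names(gr_full$pipeops))
print(names(gr_small$pipeops))
```

Printing the ids side by side is a quick way to check which preprocessing steps were considered necessary for a given task and learner.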

Usage

pipeline_robustify(
  task = NULL,
  learner = NULL,
  impute_missings = NULL,
  factors_to_numeric = NULL,
  max_cardinality = 1000
)

Arguments

task

Task A Task to create a robustifying pipeline for. Optional, if omitted, the full pipeline is created.

learner

Learner A learner to create a robustifying pipeline for. Optional, if omitted, a more conservative pipeline is built.

impute_missings

logical(1) | NULL Should missing values be imputed? Defaults to NULL, i.e., values are imputed if the task has missing values and the learner cannot handle them.

factors_to_numeric

logical(1) | NULL Should factors be encoded? Defaults to NULL, i.e., factors are encoded if the task has factors and the learner cannot handle them.

max_cardinality

integer(1) Maximum number of factor levels allowed; factors with more levels are collapsed before encoding (see Description). Default: 1000.
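
The NULL defaults can be overridden explicitly. The following is a minimal sketch, assuming the penguins task and classif.rpart learner as illustrative inputs: it forces imputation and encoding even though the learner could handle missing values and factors itself, and lowers max_cardinality to an arbitrary value of 10.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("penguins")           # illustrative task (assumption)
learner = lrn("classif.rpart")   # illustrative learner (assumption)

# Override the NULL defaults: always impute, always encode factors,
# and collapse factors with more than 10 levels before encoding
gr = pipeline_robustify(
  task = task,
  learner = learner,
  impute_missings = TRUE,
  factors_to_numeric = TRUE,
  max_cardinality = 10
)

# Apply the preprocessing to the task and inspect the result
task_out = gr$train(task)[[1]]
print(task_out$feature_types)
print(task_out$missings())
```

Training the graph on its own, without a learner attached, returns the preprocessed task, which makes it easy to inspect what each setting does to the features.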

Value

Graph
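
The returned Graph can also be obtained through the mlr_graphs dictionary. The sketch below assumes the graph is registered under the key "robustify", as the page title suggests, and uses the ppl() shortcut; the task and learner are again illustrative assumptions.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("penguins")           # illustrative task (assumption)
learner = lrn("classif.rpart")   # illustrative learner (assumption)

# Retrieve the robustify graph from the mlr_graphs dictionary
gr = ppl("robustify", task = task, learner = learner)

# Append the learner and wrap the whole graph so it behaves like a Learner
glrn = GraphLearner$new(gr %>>% po("learner", learner))
glrn$train(task)
print(glrn)
```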

Examples

# NOT RUN {
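library(mlr3)
library(mlr3pipelines)  # provides pipeline_robustify(), po() and %>>%
learner = lrn("regr.rpart")
task = mlr_tasks$get("boston_housing")
# Prepend the robustifying steps to the learner
gr = pipeline_robustify(task, learner) %>>% po("learner", learner)
resample(task, GraphLearner$new(gr), rsmp("holdout"))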
# }
