Add missing indicator columns ("dummy columns") to the Task.
Drops original features; should probably be used in combination with PipeOpFeatureUnion and imputation PipeOps (see examples).
Note that the affect_columns parameter is initialized with selector_invert(selector_type(c("factor", "ordered", "character"))), because missing
values in factorial columns are often handled by out-of-range imputation (PipeOpImputeOOR) instead.
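As a minimal sketch of this default, the following trains the PipeOp on a small made-up task. The data frame `toy` and its column names are hypothetical and only serve as illustration. Under the default affect_columns, only the numeric column is expected to receive an indicator.

```r
library("mlr3")
library("mlr3pipelines")

# Hypothetical toy data: one numeric and one factor feature, both with NAs.
toy = data.frame(
  y = factor(c("a", "b", "a", "b")),
  num = c(1, NA, 3, 4),
  fct = factor(c("u", NA, "v", "u"))
)
toy_task = TaskClassif$new(id = "toy", backend = toy, target = "y")

# Default affect_columns skips factor, ordered and character features.
toy_out = po("missind")$train(list(toy_task))[[1]]
toy_out$feature_names
```

To add an indicator for the factor column as well, affect_columns can be set explicitly, for example to selector_all().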
R6Class object inheriting from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp.
PipeOpMissInd$new(id = "missind", param_vals = list())
id :: character(1)
Identifier of the resulting object, defaulting to "missind".
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
$state is a named list with the $state elements inherited from PipeOpTaskPreproc, as well as:
indicand_cols :: character
Names of columns for which indicator columns are added. If the which parameter is "all", this is just the names of all features,
otherwise it is the names of all features that had missing values during training.
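The state can be inspected after training. This is a small sketch using the "pima" task from the examples; the variable names are chosen here for illustration.

```r
library("mlr3")
library("mlr3pipelines")

pima = tsk("pima")
op = po("missind")  # which = "missing_train" by default
invisible(op$train(list(pima)))

# Columns that received an indicator during training
op$state$indicand_cols

# For comparison: features that contain at least one missing value
na_counts = colSums(is.na(pima$data(cols = pima$feature_names)))
names(na_counts)[na_counts > 0]
```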
The parameters are those inherited from PipeOpTaskPreproc, as well as:
which :: character(1)
Determines for which features the indicator columns are added. Can either be "missing_train" (default), adding indicator columns
for each feature that actually has missing values, or "all", adding indicator columns for all features.
type :: character(1)
Determines the type of the newly created columns. Can be one of "factor" (default), "integer", "logical", "numeric".
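A short sketch of how both parameters might be set at construction, assuming the "pima" task from the examples. Here every feature gets an indicator, stored as a logical column.

```r
library("mlr3")
library("mlr3pipelines")

pima = tsk("pima")

# Indicators for all features, encoded as logical instead of factor
op_all = po("missind", which = "all", type = "logical")
ind_task = op_all$train(list(pima))[[1]]

ind_task$feature_names
sapply(ind_task$data(cols = ind_task$feature_names), class)
```

The same settings can also be changed after construction through op_all$param_set$values.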
This PipeOp should cover most cases where "dummy columns" or "missing indicators" are desired. Some edge cases:
If imputation for factorial features is performed and only numeric features should gain missing indicators, the affect_columns parameter
can be set to selector_type("numeric").
If missing indicators should only be added for features that have more than a fraction of x missing values, the
PipeOpRemoveConstants can be used with affect_columns = selector_grep("^missing_") and ratio = x.
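Both edge cases can be sketched as follows. The threshold x = 0.4 is an arbitrary value picked for illustration, and the "pima" task is assumed from the examples. In that task only insulin has more than 40% missing values, so only its indicator is expected to remain after the second step.

```r
library("mlr3")
library("mlr3pipelines")

# Edge case 1: restrict indicators to numeric features
op_numeric = po("missind", affect_columns = selector_type("numeric"))

# Edge case 2: keep only indicators of features with more than a
# fraction x of missing values (x is an illustrative choice)
x = 0.4
graph = po("missind") %>>%
  po("removeconstants",
    affect_columns = selector_grep("^missing_"),
    ratio = x)

pima = tsk("pima")
filtered = graph$train(pima)[[1]]
filtered$feature_names
```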
Fields inherited from PipeOpTaskPreproc/PipeOp.
Methods inherited from PipeOpTaskPreproc/PipeOp.
https://mlr3book.mlr-org.com/list-pipeops.html
Other PipeOps: 
PipeOpEnsemble,
PipeOpImpute,
PipeOpTargetTrafo,
PipeOpTaskPreprocSimple,
PipeOpTaskPreproc,
PipeOp,
mlr_pipeops_boxcox,
mlr_pipeops_branch,
mlr_pipeops_chunk,
mlr_pipeops_classbalancing,
mlr_pipeops_classifavg,
mlr_pipeops_classweights,
mlr_pipeops_colapply,
mlr_pipeops_collapsefactors,
mlr_pipeops_colroles,
mlr_pipeops_copy,
mlr_pipeops_datefeatures,
mlr_pipeops_encodeimpact,
mlr_pipeops_encodelmer,
mlr_pipeops_encode,
mlr_pipeops_featureunion,
mlr_pipeops_filter,
mlr_pipeops_fixfactors,
mlr_pipeops_histbin,
mlr_pipeops_ica,
mlr_pipeops_imputeconstant,
mlr_pipeops_imputehist,
mlr_pipeops_imputelearner,
mlr_pipeops_imputemean,
mlr_pipeops_imputemedian,
mlr_pipeops_imputemode,
mlr_pipeops_imputeoor,
mlr_pipeops_imputesample,
mlr_pipeops_kernelpca,
mlr_pipeops_learner,
mlr_pipeops_modelmatrix,
mlr_pipeops_multiplicityexply,
mlr_pipeops_multiplicityimply,
mlr_pipeops_mutate,
mlr_pipeops_nmf,
mlr_pipeops_nop,
mlr_pipeops_ovrsplit,
mlr_pipeops_ovrunite,
mlr_pipeops_pca,
mlr_pipeops_proxy,
mlr_pipeops_quantilebin,
mlr_pipeops_randomprojection,
mlr_pipeops_randomresponse,
mlr_pipeops_regravg,
mlr_pipeops_removeconstants,
mlr_pipeops_renamecolumns,
mlr_pipeops_replicate,
mlr_pipeops_scalemaxabs,
mlr_pipeops_scalerange,
mlr_pipeops_scale,
mlr_pipeops_select,
mlr_pipeops_smote,
mlr_pipeops_spatialsign,
mlr_pipeops_subsample,
mlr_pipeops_targetinvert,
mlr_pipeops_targetmutate,
mlr_pipeops_targettrafoscalerange,
mlr_pipeops_textvectorizer,
mlr_pipeops_threshold,
mlr_pipeops_tunethreshold,
mlr_pipeops_unbranch,
mlr_pipeops_updatetarget,
mlr_pipeops_vtreat,
mlr_pipeops_yeojohnson,
mlr_pipeops
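library("mlr3")
library("mlr3pipelines")  # provides po(), %>>% and the PipeOps used below

# keep two features that both contain missing values
task = tsk("pima")$select(c("insulin", "triceps"))
sum(complete.cases(task$data()))
task$missings()
tail(task$data())

# missind on its own replaces the features with their indicator columns
po_missind = po("missind")
new_task = po_missind$train(list(task))[[1]]
tail(new_task$data())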
# proper imputation + missing indicators
impgraph = list(
  po("imputesample"),
  po("missind")
) %>>% po("featureunion")
tail(impgraph$train(task)[[1]]$data())