Applies a function to each column of a task. Use the `affect_columns` parameter inherited from `PipeOpTaskPreprocSimple` to limit the columns this function should be applied to. This can be used for simple parameter transformations or type conversions (e.g. `as.numeric`).

The same function is applied during training and prediction. An important requirement for machine learning preprocessing is that, during the prediction phase, the preprocessing of each data row is independent of all other rows. Therefore, the `applicator` function should always return a vector / list where each result component depends only on the corresponding input component and not on other components. As a rule of thumb, if the function `f` generates output different from `Vectorize(f)`, it is not a function that should be used for `applicator`.
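The rule of thumb above can be checked directly in base R. The following is a minimal sketch; the helper `matches_vectorized` and the example functions `f_scalar`, `double_it`, and `center_it` are hypothetical names introduced only for illustration and are not part of the package.

```r
# A scalar-only function: `if` expects a single TRUE/FALSE condition.
f_scalar <- function(x) if (x > 1) "a" else "b"

# Vectorize() wraps it so that each element is processed on its own.
f_vec <- Vectorize(f_scalar)

# Hypothetical helper: does `f` agree with its element-wise version on `x`?
matches_vectorized <- function(f, x) {
  isTRUE(all.equal(f(x), Vectorize(f)(x)))
}

x <- c(0.5, 1.5, 2.5)

f_vec(x)  # "b" "a" "a"

double_it <- function(v) v * 2        # each result depends on one input only
center_it <- function(v) v - mean(v)  # each result depends on all inputs

matches_vectorized(double_it, x)  # TRUE:  suitable as an applicator
matches_vectorized(center_it, x)  # FALSE: mixes information across rows
```

A function like `center_it` would produce different values for the same row depending on which other rows happen to be in the prediction batch, which is exactly what the rule of thumb is meant to catch.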

`R6Class` object inheriting from `PipeOpTaskPreprocSimple`/`PipeOpTaskPreproc`/`PipeOp`.

`PipeOpColApply$new(id = "colapply", param_vals = list())`

`id` :: `character(1)`
Identifier of resulting object, default `"colapply"`.

`param_vals` :: named `list`
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default `list()`.

Input and output channels are inherited from `PipeOpTaskPreprocSimple`.

The output is the input `Task` with features changed according to the `applicator` parameter.

The `$state` is a named `list` with the `$state` elements inherited from `PipeOpTaskPreprocSimple`.

The parameters are the parameters inherited from `PipeOpTaskPreprocSimple`, as well as:

`applicator` :: `function`
Function to apply to each column of the task. The return value should be a `vector` of the same length as the input, i.e., the function vectorizes over the input. A typical example would be `as.numeric`. The return value can also be a `matrix`, `data.frame`, or `data.table`. In this case, the number of returned rows must match the length of the input. The names of the resulting features of the output `Task` are based on the (column) name(s) of the return value of the applicator function, prefixed with the original feature name and separated by a dot (`.`). Use `Vectorize` to create a vectorizing function from any function that ordinarily only takes one element input.
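The length requirement on the return value can be illustrated in base R. This is only a sketch: the helper `one_row_per_input` is a hypothetical name, not a function of the package, and it checks just the shape of the result, not whether rows are computed independently.

```r
# Hypothetical helper: TRUE if `f` returns one element (for vectors) or one
# row (for matrices / data frames) per input element.
one_row_per_input <- function(f, x) {
  NROW(f(x)) == length(x)
}

# Vector result of the same length as the input.
one_row_per_input(as.numeric, c("1", "2", "3"))

# Matrix result with one row per input element and named columns.
f2 <- function(x) cbind(floor = floor(x), ceiling = ceiling(x))
one_row_per_input(f2, c(1.2, 2.7, 3.5))

# `range` collapses the input to two values, so it violates the requirement.
one_row_per_input(range, c(1, 2, 3))
```

The first two calls return `TRUE` and the last returns `FALSE`.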

Calls `map` on the data, using the value of `applicator` as `.f`, and coerces the output via `as.data.table`.
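The column-wise application and the resulting feature names can be approximated in base R. The sketch below is an assumption-laden stand-in, not the package's implementation: `lapply` is used in place of `map`, `as.data.frame` in place of `as.data.table`, and the data frame `dat` is made up for illustration.

```r
# Toy data standing in for the feature columns of a task.
dat <- data.frame(a = c(1.2, 2.7), b = c(3.1, 4.9))

# Applicator returning a plain vector: column names are kept as they are.
res_simple <- as.data.frame(lapply(dat, as.integer))
names(res_simple)  # "a" "b"

# Applicator returning a matrix with named columns: each new column name is
# the original column name, a dot, and the matrix column name.
f2 <- function(x) cbind(floor = floor(x), ceiling = ceiling(x))
res_multi <- as.data.frame(lapply(dat, f2))
names(res_multi)   # "a.floor" "a.ceiling" "b.floor" "b.ceiling"
```

In both cases the number of rows is unchanged, which is what allows the result to replace the original feature columns.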

Only fields inherited from `PipeOpTaskPreprocSimple`/`PipeOpTaskPreproc`/`PipeOp`.

Only methods inherited from `PipeOpTaskPreprocSimple`/`PipeOpTaskPreproc`/`PipeOp`.

https://mlr3book.mlr-org.com/list-pipeops.html

Other PipeOps: `PipeOpEnsemble`, `PipeOpImpute`, `PipeOpTargetTrafo`, `PipeOpTaskPreprocSimple`, `PipeOpTaskPreproc`, `PipeOp`, `mlr_pipeops_boxcox`, `mlr_pipeops_branch`, `mlr_pipeops_chunk`, `mlr_pipeops_classbalancing`, `mlr_pipeops_classifavg`, `mlr_pipeops_classweights`, `mlr_pipeops_collapsefactors`, `mlr_pipeops_colroles`, `mlr_pipeops_copy`, `mlr_pipeops_datefeatures`, `mlr_pipeops_encodeimpact`, `mlr_pipeops_encodelmer`, `mlr_pipeops_encode`, `mlr_pipeops_featureunion`, `mlr_pipeops_filter`, `mlr_pipeops_fixfactors`, `mlr_pipeops_histbin`, `mlr_pipeops_ica`, `mlr_pipeops_imputeconstant`, `mlr_pipeops_imputehist`, `mlr_pipeops_imputelearner`, `mlr_pipeops_imputemean`, `mlr_pipeops_imputemedian`, `mlr_pipeops_imputemode`, `mlr_pipeops_imputeoor`, `mlr_pipeops_imputesample`, `mlr_pipeops_kernelpca`, `mlr_pipeops_learner`, `mlr_pipeops_missind`, `mlr_pipeops_modelmatrix`, `mlr_pipeops_multiplicityexply`, `mlr_pipeops_multiplicityimply`, `mlr_pipeops_mutate`, `mlr_pipeops_nmf`, `mlr_pipeops_nop`, `mlr_pipeops_ovrsplit`, `mlr_pipeops_ovrunite`, `mlr_pipeops_pca`, `mlr_pipeops_proxy`, `mlr_pipeops_quantilebin`, `mlr_pipeops_randomprojection`, `mlr_pipeops_randomresponse`, `mlr_pipeops_regravg`, `mlr_pipeops_removeconstants`, `mlr_pipeops_renamecolumns`, `mlr_pipeops_replicate`, `mlr_pipeops_scalemaxabs`, `mlr_pipeops_scalerange`, `mlr_pipeops_scale`, `mlr_pipeops_select`, `mlr_pipeops_smote`, `mlr_pipeops_spatialsign`, `mlr_pipeops_subsample`, `mlr_pipeops_targetinvert`, `mlr_pipeops_targetmutate`, `mlr_pipeops_targettrafoscalerange`, `mlr_pipeops_textvectorizer`, `mlr_pipeops_threshold`, `mlr_pipeops_tunethreshold`, `mlr_pipeops_unbranch`, `mlr_pipeops_updatetarget`, `mlr_pipeops_vtreat`, `mlr_pipeops_yeojohnson`, `mlr_pipeops`

```r
library("mlr3")
library("mlr3pipelines")  # provides po(), selector_grep(), selector_all()

task = tsk("iris")
poca = po("colapply", applicator = as.character)
poca$train(list(task))[[1]]  # types are converted

# function that does not vectorize
f1 = function(x) {
  # we could use `ifelse` here, but that is not the point
  if (x > 1) {
    "a"
  } else {
    "b"
  }
}
poca$param_set$values$applicator = Vectorize(f1)
poca$train(list(task))[[1]]$data()

# only affect Petal.* columns
poca$param_set$values$affect_columns = selector_grep("^Petal")
poca$train(list(task))[[1]]$data()

# function returning multiple columns
f2 = function(x) {
  cbind(floor = floor(x), ceiling = ceiling(x))
}
poca$param_set$values$applicator = f2
poca$param_set$values$affect_columns = selector_all()
poca$train(list(task))[[1]]$data()
```