# mlr_pipeops_colapply

##### PipeOpColApply

Applies a function to each column of a task. Use the `affect_columns` parameter inherited from `PipeOpTaskPreproc` to limit the columns this function should be applied to. This can be used for simple parameter transformations or type conversions (e.g. `as.numeric`).

The same function is applied during training and prediction. An important requirement for machine learning preprocessing is that, during the prediction phase, the preprocessing of each data row is independent of all other rows. Therefore, the `applicator` function should always return a vector / list where each result component depends only on the corresponding input component and not on other components. As a rule of thumb, if the function `f` generates output different from `Vectorize(f)`, it is not a function that should be used for `applicator`.
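The rule of thumb can be checked in plain R before a function is passed as `applicator`. The sketch below uses two hypothetical helper functions, `center` and `double_it`, which are not part of the package; only `Vectorize()` and `identical()` from base R are assumed.

```r
# Hypothetical function whose result for one element depends on all elements.
center <- function(x) x - mean(x)

# Hypothetical function that treats every element independently.
double_it <- function(x) x * 2

x <- c(1, 2, 6)

# Vectorize() calls the function once per element, so any dependence on
# the other elements disappears and the two results differ.
identical(center(x), Vectorize(center)(x))        # FALSE: not a suitable applicator

# An element-wise function is unchanged by Vectorize().
identical(double_it(x), Vectorize(double_it)(x))  # TRUE: suitable
```

A function like `center` would make the prediction for one row depend on which other rows happen to be in the prediction batch, which is what the rule is meant to prevent.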

##### Format

`R6Class` object inheriting from `PipeOpTaskPreproc`/`PipeOp`.

##### Construction

`PipeOpColApply$new(id = "colapply", param_vals = list())`

`id` :: `character(1)`
Identifier of resulting object, default `"colapply"`.

`param_vals` :: named `list`
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default `list()`.
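As a minimal sketch of the constructor described above, the following creates the PipeOp directly through `$new()` instead of the `po()` shorthand. The id `"to_character"` is a hypothetical name chosen for illustration, and the example assumes that the mlr3pipelines package is installed.

```r
library(mlr3pipelines)

# Hypothetical id "to_character"; the applicator is passed via param_vals.
op <- PipeOpColApply$new(
  id = "to_character",
  param_vals = list(applicator = as.character)
)

op$id                            # the identifier given above
op$param_set$values$applicator   # the function stored as hyperparameter
```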

##### Input and Output Channels

Input and output channels are inherited from `PipeOpTaskPreproc`.

The output is the input `Task` with features changed according to the `applicator` parameter.

##### State

The `$state` is a named `list` with the `$state` elements inherited from `PipeOpTaskPreproc`, as well as:

`emptydt` :: `data.table`
An empty `data.table` with columns of names and types from *output* features after training. This is used to produce a correct type conversion during prediction, even when the input has zero length and `applicator` is therefore not called.

##### Parameters

The parameters are the parameters inherited from `PipeOpTaskPreproc`, as well as:

`applicator` :: `function`
Function to apply to each column of the task. The return value must have the same length as the input, i.e. vectorize over the input. A typical example would be `as.numeric`. Use `Vectorize` to create a vectorizing function from any function that ordinarily only takes one element input. The `applicator` is not called during prediction if the input task has no rows; instead the types of affected features are changed to the result types of the `applicator` call during training. Initialized to the `identity()` function.

##### Internals

`PipeOpColApply` cannot inherit from `PipeOpTaskPreprocSimple`, because if `applicator` is given and the prediction data has 0 rows, then the resulting `data.table` does not know what the column types should be. Column type conformity between training and prediction is enforced by simply saving a copy of an empty `data.table` in the `$state$emptydt` slot.
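The zero-row problem can be reproduced in base R. The sketch below is an illustration of the idea only, using a hypothetical function `f` and a plain `data.frame` in place of the package's `data.table` code; it is not the package's implementation.

```r
# Hypothetical scalar-only function, wrapped with Vectorize().
f <- function(x) if (x > 1) "a" else "b"
g <- Vectorize(f)

class(g(c(0.5, 2)))    # "character"
class(g(numeric(0)))   # "list": with no elements there is nothing to simplify

# Template idea (assumption: mirrors the concept, not the package code):
# keep a zero-row copy of the training output so the column types survive.
train_out <- data.frame(a = g(c(0.5, 2)), stringsAsFactors = FALSE)
empty_template <- train_out[0, , drop = FALSE]

# A zero-row prediction input can reuse the template instead of calling g().
predict_out <- empty_template
class(predict_out$a)   # "character", the same type as after training
```

Because the type of `g(numeric(0))` differs from the type produced on real data, calling the function on empty input cannot recover the trained column types, which is why a stored template is used instead.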

##### Fields

Only fields inherited from `PipeOpTaskPreproc`/`PipeOp`.

##### Methods

Only methods inherited from `PipeOpTaskPreproc`/`PipeOp`.

##### See Also

Other PipeOps: `PipeOpEnsemble`, `PipeOpImpute`, `PipeOpTargetTrafo`, `PipeOpTaskPreprocSimple`, `PipeOpTaskPreproc`, `PipeOp`, `mlr_pipeops_boxcox`, `mlr_pipeops_branch`, `mlr_pipeops_chunk`, `mlr_pipeops_classbalancing`, `mlr_pipeops_classifavg`, `mlr_pipeops_classweights`, `mlr_pipeops_collapsefactors`, `mlr_pipeops_colroles`, `mlr_pipeops_copy`, `mlr_pipeops_datefeatures`, `mlr_pipeops_encodeimpact`, `mlr_pipeops_encodelmer`, `mlr_pipeops_encode`, `mlr_pipeops_featureunion`, `mlr_pipeops_filter`, `mlr_pipeops_fixfactors`, `mlr_pipeops_histbin`, `mlr_pipeops_ica`, `mlr_pipeops_imputeconstant`, `mlr_pipeops_imputehist`, `mlr_pipeops_imputelearner`, `mlr_pipeops_imputemean`, `mlr_pipeops_imputemedian`, `mlr_pipeops_imputemode`, `mlr_pipeops_imputeoor`, `mlr_pipeops_imputesample`, `mlr_pipeops_kernelpca`, `mlr_pipeops_learner`, `mlr_pipeops_missind`, `mlr_pipeops_modelmatrix`, `mlr_pipeops_multiplicityexply`, `mlr_pipeops_multiplicityimply`, `mlr_pipeops_mutate`, `mlr_pipeops_nmf`, `mlr_pipeops_nop`, `mlr_pipeops_ovrsplit`, `mlr_pipeops_ovrunite`, `mlr_pipeops_pca`, `mlr_pipeops_proxy`, `mlr_pipeops_quantilebin`, `mlr_pipeops_randomprojection`, `mlr_pipeops_randomresponse`, `mlr_pipeops_regravg`, `mlr_pipeops_removeconstants`, `mlr_pipeops_renamecolumns`, `mlr_pipeops_replicate`, `mlr_pipeops_scalemaxabs`, `mlr_pipeops_scalerange`, `mlr_pipeops_scale`, `mlr_pipeops_select`, `mlr_pipeops_smote`, `mlr_pipeops_spatialsign`, `mlr_pipeops_subsample`, `mlr_pipeops_targetinvert`, `mlr_pipeops_targetmutate`, `mlr_pipeops_targettrafoscalerange`, `mlr_pipeops_textvectorizer`, `mlr_pipeops_threshold`, `mlr_pipeops_tunethreshold`, `mlr_pipeops_unbranch`, `mlr_pipeops_updatetarget`, `mlr_pipeops_vtreat`, `mlr_pipeops_yeojohnson`, `mlr_pipeops`

##### Examples

```
library("mlr3")
library("mlr3pipelines")  # provides po() and selector_grep()

task = tsk("iris")
poca = po("colapply", applicator = as.character)
poca$train(list(task))[[1]]  # types are converted

# function that does not vectorize
f = function(x) {
  # we could use `ifelse` here, but that is not the point
  if (x > 1) {
    "a"
  } else {
    "b"
  }
}
poca$param_set$values$applicator = Vectorize(f)
poca$train(list(task))[[1]]$data()

# only affect Petal.* columns:
poca$param_set$values$affect_columns = selector_grep("^Petal")
poca$train(list(task))[[1]]$data()
```

*Documentation reproduced from package mlr3pipelines, version 0.3.0, License: LGPL-3*