This is a CPOConstructor to be used to create a CPO. It is called like any R function and returns the created CPO.
Apply a given function to the target column of a regression Task.
cpoApplyFunRegrTarget(
  trafo,
  invert.response = NULL,
  invert.se = NULL,
  param = NULL,
  vectorize = TRUE,
  gauss.points = 23,
  id,
  export = "export.default",
  affect.type = NULL,
  affect.index = integer(0),
  affect.names = character(0),
  affect.pattern = NULL,
  affect.invert = FALSE,
  affect.pattern.ignore.case = FALSE,
  affect.pattern.perl = FALSE,
  affect.pattern.fixed = FALSE
)
[CPO].
trafo [function]: A function transforming the target column. If vectorize is TRUE, the argument is a vector of the whole column; trafo must vectorize over it and return a vector of the same length. Otherwise, the function gets called once for every data item, and both the function argument and the return value must have length 1.
The function must take one or two arguments. If it takes two arguments, the second argument will be param.
invert.response [function]: If a model is trained on data that was transformed by trafo, this function should invert a prediction made by this model back to the space of the original data. In most cases, this will be the inverse of trafo, so that invert.response(trafo(x)) == x.
Similarly to trafo, this function takes / produces single elements or the whole column, depending on vectorize. The return value should be a numeric in both cases.
This can also be NULL, in which case using this CPO for invert with predict.type = "response" is not possible. Default is NULL.
invert.se [function]: Similarly to invert.response, this is a function that inverts a “se” prediction made after training on trafo'd data. This function should take at least two arguments, mean and se, and return a numeric vector of length 2 if vectorize is FALSE, or a data.frame or matrix with two numeric columns if vectorize is TRUE. The function may also take a third argument, which will be set to param.
invert.se may also be NULL, in which case “se” inversion is done by numeric integration using Gauss-Hermite quadrature. Default is NULL.
param [any]: Optional argument to be given to trafo and / or invert. If both of them only take one argument, this is ignored. Default is NULL.
vectorize [logical(1)]: Whether to call trafo, invert.response and invert.se once with the whole data column (or response and se column if predict.type == "se"), or once for each element. If the functions vectorize, it is recommended to have this set to TRUE for better performance. Default is TRUE.
gauss.points [numeric(1)]: Number of points at which to evaluate invert.response for Gauss-Hermite quadrature integration. Only used if invert.se is NULL. Default is 23.
id [character(1)]: id to use as prefix for the CPO's hyperparameters. This must be used to avoid name clashes when composing two CPOs of the same type, or with learners or other CPOs with hyperparameters with clashing names.
export [character]: Either a character vector indicating the parameters to export as hyperparameters, or one of the special values
“export.all” (export all parameters),
“export.default” (export all parameters that are exported by default),
“export.set” (export all parameters that were set during construction),
“export.default.set” (export the intersection of the “default” and “set” parameters),
“export.unset” (export all parameters that were not set during construction) or
“export.default.unset” (export the intersection of the “default” and “unset” parameters).
Default is “export.default”.
affect.type [character | NULL]: Type of columns to affect. A subset of “numeric”, “factor”, “ordered”, “other”, or NULL to not match by column type. Default is NULL.
affect.index [numeric]: Indices of feature columns to affect. The order of indices given is respected. Target column indices are not counted (since target columns are always included). Default is integer(0).
affect.names [character]: Feature names of feature columns to affect. The order of names given is respected. Default is character(0).
affect.pattern [character(1) | NULL]: grep pattern to match feature names by. Default is NULL (no pattern matching).
affect.invert [logical(1)]: Whether to affect all features not matched by other affect.* parameters.
affect.pattern.ignore.case [logical(1)]: Ignore case when matching features with affect.pattern; see grep. Default is FALSE.
affect.pattern.perl [logical(1)]: Use Perl-style regular expressions for affect.pattern; see grep. Default is FALSE.
affect.pattern.fixed [logical(1)]: Use fixed matching instead of regular expressions for affect.pattern; see grep. Default is FALSE.
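When invert.se is NULL, the “se” inversion relies on Gauss-Hermite quadrature with gauss.points nodes. The following base-R sketch illustrates the general technique only; it is not taken from the mlrCPO source. The helper names gaussHermite and invertSeQuadrature are hypothetical, and treating the prediction as normally distributed with the given mean and se is an assumption made for this illustration.

```r
# Hypothetical helper: Gauss-Hermite nodes and weights (weight function
# exp(-x^2)) via the Golub-Welsch algorithm, i.e. the eigen-decomposition
# of the Jacobi matrix of the Hermite three-term recurrence.
gaussHermite <- function(n) {
  idx <- seq_len(n - 1)
  off <- sqrt(idx / 2)
  jacobi <- matrix(0, n, n)
  jacobi[cbind(idx, idx + 1)] <- off
  jacobi[cbind(idx + 1, idx)] <- off
  eig <- eigen(jacobi, symmetric = TRUE)
  list(nodes = eig$values, weights = sqrt(pi) * eig$vectors[1, ]^2)
}

# Hypothetical helper: mean and standard deviation of
# invert.response(X) for X ~ Normal(mean, se^2), by quadrature.
invertSeQuadrature <- function(mean, se, invert.response, gauss.points = 23) {
  gh <- gaussHermite(gauss.points)
  # Change of variables x -> mean + sqrt(2) * se * x, weights scaled by 1 / sqrt(pi)
  vals <- invert.response(mean + sqrt(2) * se * gh$nodes)
  w <- gh$weights / sqrt(pi)
  m <- sum(w * vals)
  v <- sum(w * vals^2) - m^2
  c(mean = m, se = sqrt(max(v, 0)))
}

res <- invertSeQuadrature(mean = 1, se = 0.5, invert.response = exp)

# Closed-form log-normal moments, used here only as a reference
exact.mean <- exp(1 + 0.5^2 / 2)
exact.se <- sqrt((exp(0.5^2) - 1) * exp(2 * 1 + 0.5^2))
```

For a smooth invert.response such as exp, the quadrature result in res should agree closely with exact.mean and exact.se; more strongly non-linear functions may need a different number of points.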
This function creates a CPO object, which can be applied to Tasks, data.frames, Learners and other CPO objects using the %>>% operator.
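As a minimal sketch of this workflow, assuming the mlr and mlrCPO packages are installed and using mlr's bundled bh.task as example data (the choice of log / exp and of the "regr.lm" learner is an assumption made for illustration):

```r
library(mlr)
library(mlrCPO)

# Log-transform the regression target; exp undoes it at prediction time.
# Both functions are wrapped so that they take exactly one argument.
cpo <- cpoApplyFunRegrTarget(
  trafo = function(x) log(x),
  invert.response = function(x) exp(x)
)

# Apply the CPO to a Task: the target column is transformed.
transformed <- bh.task %>>% cpo

# Attach the CPO to a Learner: training sees the transformed target,
# and predictions are inverted back to the original scale.
lrn <- cpo %>>% makeLearner("regr.lm")
model <- train(lrn, bh.task)
pred <- predict(model, bh.task)
```

Because invert.response is supplied, predictions with predict.type = "response" can be inverted; leaving it NULL would make that inversion impossible, as described for the invert.response argument.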
The parameters of this object can be changed after creation using the function setHyperPars. The other hyperparameter-manipulating functions, getHyperPars and getParamSet, similarly work as one expects.
If the “id” parameter is given, the hyperparameters will have this id as a prefix; this will, however, not change the parameters of the creator function.
CPO constructor functions are called with optional values of parameters, and additional “special” optional values.
The special optional values are the id parameter, and the affect.* parameters. The affect.* parameters enable the user to control which subset of a given dataset is affected. If no affect.* parameters are given, all data features are affected by default.
When both mean and se predictions are available, it may be possible to perform a more accurate mean inversion than for the response predict.type, using integrals or approximations like the delta method. In such cases it may be advisable to prepend this CPO with the cpoResponseFromSE CPO.
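To illustrate why the mean and se together can give a better mean inversion, the following base-R sketch compares a naive inversion with a second-order delta-method approximation for the case invert.response = exp. It assumes the prediction on the transformed scale is normally distributed, in which case the exact mean is known in closed form. The variable names are hypothetical, and the snippet does not use cpoResponseFromSE itself.

```r
mu <- 1    # predicted mean on the transformed (log) scale
se <- 0.5  # predicted standard error on the transformed scale

# Naive inversion: apply the inverse function to the mean only
naive <- exp(mu)

# Second-order delta method: f(mu) + f''(mu) * se^2 / 2, with f = exp
delta <- exp(mu) + exp(mu) * se^2 / 2

# Exact mean of exp(X) for X ~ Normal(mu, se^2)
exact <- exp(mu + se^2 / 2)
```

Under these assumptions the naive value underestimates the exact mean, and the delta-method value lies closer to it.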
Note that when trafo or invert.response takes more than one argument, the second argument will be set to the value of param. This may lead to unexpected results when using functions with rarely used parameters, e.g. log. In these cases, it may be necessary to wrap the function: trafo = function(x) log(x).
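The effect can be reproduced in base R without any CPO: log has a second argument, base, so a value passed in the second position changes the result. The value 10 below is a hypothetical stand-in for param.

```r
x <- c(1, 10, 100)
param <- 10  # hypothetical value that would be forwarded as second argument

# The second positional argument of log() is `base`,
# so this computes the logarithm to base 10.
unwrapped.result <- log(x, param)

# A one-argument wrapper cannot receive a second argument,
# so it always computes the natural logarithm.
wrapped <- function(x) log(x)
wrapped.result <- wrapped(x)
```

The wrapper has exactly one formal argument, which is what keeps a second argument from reaching log.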
Other CPOs: cpoApplyFun(), cpoAsNumeric(), cpoCache(), cpoCbind(), cpoCollapseFact(), cpoDropConstants(), cpoDropMostlyConstants(), cpoDummyEncode(), cpoFilterAnova(), cpoFilterCarscore(), cpoFilterChiSquared(), cpoFilterFeatures(), cpoFilterGainRatio(), cpoFilterInformationGain(), cpoFilterKruskal(), cpoFilterLinearCorrelation(), cpoFilterMrmr(), cpoFilterOneR(), cpoFilterPermutationImportance(), cpoFilterRankCorrelation(), cpoFilterRelief(), cpoFilterRfCImportance(), cpoFilterRfImportance(), cpoFilterRfSRCImportance(), cpoFilterRfSRCMinDepth(), cpoFilterSymmetricalUncertainty(), cpoFilterUnivariate(), cpoFilterVariance(), cpoFixFactors(), cpoIca(), cpoImpactEncodeClassif(), cpoImpactEncodeRegr(), cpoImputeConstant(), cpoImputeHist(), cpoImputeLearner(), cpoImputeMax(), cpoImputeMean(), cpoImputeMedian(), cpoImputeMin(), cpoImputeMode(), cpoImputeNormal(), cpoImputeUniform(), cpoImpute(), cpoLogTrafoRegr(), cpoMakeCols(), cpoMissingIndicators(), cpoModelMatrix(), cpoOversample(), cpoPca(), cpoProbEncode(), cpoQuantileBinNumerics(), cpoRegrResiduals(), cpoResponseFromSE(), cpoSample(), cpoScaleMaxAbs(), cpoScaleRange(), cpoScale(), cpoSelect(), cpoSmote(), cpoSpatialSign(), cpoTransformParams(), cpoWrap(), makeCPOCase(), makeCPOMultiplex()