mlrCPO (version 0.3.7-2)

cpoDropMostlyConstants: Drop Constant or Near-Constant Features

Description

This is a CPOConstructor to be used to create a CPO. It is called like any R function and returns the created CPO.

Drop all columns that are mostly constant: numeric columns that are constant within tolerance, and factor or ordered columns that have only one value.

This CPO can also filter “mostly” constant features: those where at most a fraction ratio of the samples differs from the mode value.
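The “mostly constant” criterion can be illustrated in base R. The following is a minimal sketch, not the package's implementation: the helper frac_differing_from_mode and the example data are hypothetical, missing values are assumed to be absent, and the numeric tolerances are ignored.

```r
# Fraction of entries that differ from the most frequent value (the mode).
# Illustrative helper, not part of mlrCPO; assumes no missing values.
frac_differing_from_mode <- function(x) {
  counts <- table(x)
  1 - max(counts) / length(x)
}

# Hypothetical example data.
df <- data.frame(
  a = c(rep(1, 9), 2),      # differs from its mode in 1 of 10 rows
  b = 1:10,                 # all values distinct
  c = factor(rep("x", 10))  # strictly constant
)

fractions <- vapply(df, frac_differing_from_mode, numeric(1))

# A column is kept only if more than `ratio` of its entries differ from the mode.
ratio <- 0.2
kept <- names(fractions)[fractions > ratio]
print(fractions)
print(kept)
```

In this sketch, lowering ratio to 0 would keep every column that takes more than one distinct value.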

Usage

cpoDropMostlyConstants(
  ratio = 0,
  rel.tol = 1e-08,
  abs.tol = 1e-08,
  ignore.na = FALSE,
  id,
  export = "export.default",
  affect.type = NULL,
  affect.index = integer(0),
  affect.names = character(0),
  affect.pattern = NULL,
  affect.invert = FALSE,
  affect.pattern.ignore.case = FALSE,
  affect.pattern.perl = FALSE,
  affect.pattern.fixed = FALSE
)

Arguments

ratio

[numeric(1)] Minimum ratio of values which must be different from the mode value in order to keep a feature in the task. Default is 0, which means only constant features with exactly one observed level are removed.

rel.tol

[numeric(1)] Relative tolerance within which to consider a feature constant. Set to 0 to disregard relative tolerance. Default is 1e-8.

abs.tol

[numeric(1)] Absolute tolerance within which to consider a feature constant. Set to 0 to disregard absolute tolerance. Default is 1e-8.

ignore.na

[logical(1)] Whether to ignore NA and NaN values. If this is TRUE, values that are NA or NaN will not be counted as different from any other value. If this is FALSE, columns with NA or NaN in them will only count as constant if they are entirely made up of NA, or entirely made up of NaN. Default is FALSE.

id

[character(1)] id to use as prefix for the CPO's hyperparameters. This must be used to avoid name clashes when composing two CPOs of the same type, or with learners or other CPOs that have hyperparameters with clashing names.

export

[character] Either a character vector indicating the parameters to export as hyperparameters, or one of the special values “export.all” (export all parameters), “export.default” (export all parameters that are exported by default), “export.set” (export all parameters that were set during construction), “export.default.set” (export the intersection of the “default” and “set” parameters), “export.unset” (export all parameters that were not set during construction) or “export.default.unset” (export the intersection of the “default” and “unset” parameters). Default is “export.default”.

affect.type

[character | NULL] Type of columns to affect. A subset of “numeric”, “factor”, “ordered”, “other”, or NULL to not match by column type. Default is NULL.

affect.index

[numeric] Indices of feature columns to affect. The order of indices given is respected. Target column indices are not counted (since target columns are always included). Default is integer(0).

affect.names

[character] Feature names of feature columns to affect. The order of names given is respected. Default is character(0).

affect.pattern

[character(1) | NULL] grep pattern to match feature names by. Default is NULL (no pattern matching).

affect.invert

[logical(1)] Whether to affect all features not matched by other affect.* parameters. Default is FALSE.

affect.pattern.ignore.case

[logical(1)] Ignore case when matching features with affect.pattern; see grep. Default is FALSE.

affect.pattern.perl

[logical(1)] Use Perl-style regular expressions for affect.pattern; see grep. Default is FALSE.

affect.pattern.fixed

[logical(1)] Use fixed matching instead of regular expressions for affect.pattern; see grep. Default is FALSE.

Value

[CPO].
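As a usage sketch, the returned CPO can be applied to a data.frame with %>>%. The example data is hypothetical, and the columns named in the comments are what the argument descriptions above suggest, not verified output.

```r
library(mlrCPO)

# Hypothetical example data.
df <- data.frame(
  a = c(rep(1, 9), 2),      # differs from its mode in 10% of rows
  b = 1:10,                 # all values distinct
  c = factor(rep("x", 10))  # strictly constant
)

# Default ratio = 0: only the strictly constant column "c" should be dropped.
res.strict <- df %>>% cpoDropMostlyConstants()

# ratio = 0.2: "a" should be dropped as well, since only 10% of its rows
# differ from the mode.
res.loose <- df %>>% cpoDropMostlyConstants(ratio = 0.2)

print(names(res.strict))
print(names(res.loose))
```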

General CPO info

This function creates a CPO object, which can be applied to Tasks, data.frames, Learners and other CPO objects using the %>>% operator.

The parameters of this object can be changed after creation using the function setHyperPars. The other hyperparameter-manipulating functions, getHyperPars and getParamSet, work as expected.

If the “id” parameter is given, the hyperparameters will have this id as a prefix; this will, however, not change the parameters of the creator function.
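A short sketch of how the id prefix shows up in practice. The id "dmc" is an arbitrary choice for illustration, and the prefixed name dmc.ratio assumes the usual "id.parameter" naming.

```r
library(mlrCPO)

# Construct the CPO with an id; the constructor argument is still called `ratio`.
cpo <- cpoDropMostlyConstants(ratio = 0.1, id = "dmc")

# The exported hyperparameters carry the id as a prefix.
print(names(getHyperPars(cpo)))

# Change the ratio after construction, using the prefixed name.
cpo <- setHyperPars(cpo, dmc.ratio = 0.25)
print(getHyperPars(cpo)$dmc.ratio)
```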

Calling a CPOConstructor

CPO constructor functions are called with optional values of parameters, and additional “special” optional values. The special optional values are the id parameter, and the affect.* parameters. The affect.* parameters enable the user to control which subset of a given dataset is affected. If no affect.* parameters are given, all data features are affected by default.
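The effect of the affect.* parameters can be sketched as follows. The data and column names are hypothetical; the expectation stated in the comments is that columns outside the affected subset pass through unchanged.

```r
library(mlrCPO)

# Hypothetical example data with two constant columns of different types.
df <- data.frame(
  num.const = rep(3.14, 10),
  fac.const = factor(rep("x", 10)),
  num.var = (1:10) / 10
)

# Only numeric columns are considered, so the constant factor should remain.
res.type <- df %>>% cpoDropMostlyConstants(affect.type = "numeric")

# Only the named column is considered, so the constant numeric should remain.
res.names <- df %>>% cpoDropMostlyConstants(affect.names = "fac.const")

print(names(res.type))
print(names(res.names))
```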

See Also

Other CPOs: cpoApplyFunRegrTarget(), cpoApplyFun(), cpoAsNumeric(), cpoCache(), cpoCbind(), cpoCollapseFact(), cpoDropConstants(), cpoDummyEncode(), cpoFilterAnova(), cpoFilterCarscore(), cpoFilterChiSquared(), cpoFilterFeatures(), cpoFilterGainRatio(), cpoFilterInformationGain(), cpoFilterKruskal(), cpoFilterLinearCorrelation(), cpoFilterMrmr(), cpoFilterOneR(), cpoFilterPermutationImportance(), cpoFilterRankCorrelation(), cpoFilterRelief(), cpoFilterRfCImportance(), cpoFilterRfImportance(), cpoFilterRfSRCImportance(), cpoFilterRfSRCMinDepth(), cpoFilterSymmetricalUncertainty(), cpoFilterUnivariate(), cpoFilterVariance(), cpoFixFactors(), cpoIca(), cpoImpactEncodeClassif(), cpoImpactEncodeRegr(), cpoImputeConstant(), cpoImputeHist(), cpoImputeLearner(), cpoImputeMax(), cpoImputeMean(), cpoImputeMedian(), cpoImputeMin(), cpoImputeMode(), cpoImputeNormal(), cpoImputeUniform(), cpoImpute(), cpoLogTrafoRegr(), cpoMakeCols(), cpoMissingIndicators(), cpoModelMatrix(), cpoOversample(), cpoPca(), cpoProbEncode(), cpoQuantileBinNumerics(), cpoRegrResiduals(), cpoResponseFromSE(), cpoSample(), cpoScaleMaxAbs(), cpoScaleRange(), cpoScale(), cpoSelect(), cpoSmote(), cpoSpatialSign(), cpoTransformParams(), cpoWrap(), makeCPOCase(), makeCPOMultiplex()