Filter

Filter Base Class

Base class for filters. Predefined filters are stored in the dictionary mlr_filters. A Filter calculates a score for each feature of a task. Important features get a large value and unimportant features get a small value. Note that filter scores may also be negative.
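As a rough illustration of the scoring idea, the sketch below retrieves a predefined filter from mlr_filters and scores the features of the iris task. It assumes that the packages mlr3 and mlr3filters are installed, that the task key "iris" and the filter key "variance" exist in their dictionaries, and that the installed version matches the interface described on this page.

```r
library(mlr3)
library(mlr3filters)

# Retrieve a predefined task and a predefined filter from their dictionaries.
# The keys "iris" and "variance" are assumptions made for this example.
task = mlr_tasks$get("iris")
filter = mlr_filters$get("variance")

# Compute one score per feature; the result is stored in the field `scores`.
filter$calculate(task)

# Named numeric vector, sorted in decreasing order.
print(filter$scores)
```

Larger values indicate more important features, so the first element of the printed vector is the top-ranked feature.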

Keywords

datasets

Format

R6::R6Class object.

Construction

f = Filter$new(id, task_type, param_set, param_vals, feature_types, packages)
  • id :: character(1) Identifier for the filter.

  • task_type :: character() Types of the task the filter can operate on. E.g., "classif" or "regr".

  • param_set :: paradox::ParamSet Set of hyperparameters.

  • param_vals :: named list() Named list of hyperparameter settings.

  • feature_types :: character() Feature types the filter operates on. Must be a subset of mlr_reflections$task_feature_types.

  • task_properties :: character() Required task properties, see mlr3::Task. Must be a subset of mlr_reflections$task_properties.

  • packages :: character() Set of required packages. Note that these packages will be loaded via requireNamespace(), and are not attached.
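The difference between loading and attaching a package can be sketched in base R. The example uses the package tools, which ships with R and is chosen purely for illustration; it is not a dependency named by this page.

```r
# Example package shipped with R; chosen for illustration only.
pkg = "tools"
search_entry = paste0("package:", pkg)

# Record whether the package is on the search path before loading.
attached_before = search_entry %in% search()

# Load the namespace without attaching it.
ok = requireNamespace(pkg, quietly = TRUE)

# The namespace is now loaded ...
loaded = isNamespaceLoaded(pkg)

# ... but the search path is unchanged by requireNamespace().
attached_after = search_entry %in% search()

print(c(ok = ok, loaded = loaded,
        attached_before = attached_before, attached_after = attached_after))
```

Functions of a loaded but unattached package remain reachable via the `pkg::fun` syntax, which keeps the user's search path untouched.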

Fields

All arguments passed to the constructor are available as fields, and additionally:

  • scores :: named numeric() Stores the calculated filter score values as a named numeric vector. The vector is sorted in decreasing order, with possible NA values last. Tied values (including NA values) appear in a random, non-deterministic order.
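The ordering described for scores can be reproduced in base R. The sketch below is one possible way to obtain it and is not necessarily how the package implements the sorting; the raw score values are made up for the example.

```r
# Hypothetical raw scores, including a tie (a, d), a negative value, and an NA.
raw = c(a = 0.2, b = 0.9, c = NA, d = 0.2, e = -0.5)

# Shuffle first so that tied values end up in a random relative order,
# then sort in decreasing order with NA values placed last.
shuffled = raw[sample(length(raw))]
scores = shuffled[order(shuffled, decreasing = TRUE, na.last = TRUE)]

print(scores)
```

In the result, b always comes first and c (the NA) last, while the relative position of the tied features a and d may change between runs.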

Methods

  • calculate(task, nfeat = NULL) (mlr3::Task, integer(1)) -> self Calculates the filter score values for the provided mlr3::Task and stores them in field scores. nfeat determines the minimum number of features to score (see "Partial Scoring"), and defaults to the number of features in task. Loads required packages and then calls $calculate_internal(). If the task has no rows, each feature gets the score NA.

  • calculate_internal(task, nfeat) (mlr3::Task, integer(1)) -> named numeric() Internal worker function. Each child class must implement this method. It takes a task and the minimum number of features to score, and must return a named numeric vector of scores. The higher the score, the more important the feature. The calling function (calculate()) ensures that the returned vector is sorted and that features without a score receive the value NA.
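The division of labour between calculate() and calculate_internal() can be mimicked with plain functions. The sketch below is a base R stand-in under simplifying assumptions: a data.frame replaces the mlr3::Task, and the names score_variance and calculate_sketch are hypothetical and not part of the package.

```r
# Hypothetical stand-in for a child class's calculate_internal():
# scores each numeric column by its variance.
score_variance = function(data, nfeat) {
  vapply(data, stats::var, numeric(1))
}

# Hypothetical stand-in for calculate(): handles the defaults, the
# zero-row case, missing scores, and the final sorting.
calculate_sketch = function(data, worker, nfeat = NULL) {
  features = names(data)
  if (is.null(nfeat)) nfeat = length(features)
  if (nrow(data) == 0L) {
    # No rows: every feature gets the score NA.
    scores = setNames(rep(NA_real_, length(features)), features)
  } else {
    scores = worker(data, nfeat)
    # Features the worker did not score receive NA.
    unscored = setdiff(features, names(scores))
    scores = c(scores, setNames(rep(NA_real_, length(unscored)), unscored))
  }
  scores[order(scores, decreasing = TRUE, na.last = TRUE)]
}

data = data.frame(x = c(1, 2, 3, 4), y = c(10, 20, 30, 40), z = c(5, 5, 5, 5))
scores = calculate_sketch(data, score_variance)
empty = calculate_sketch(data[0, ], score_variance)
print(scores)
print(empty)
```

Keeping the bookkeeping in the wrapper means a worker only has to return a named numeric vector, in any order and possibly incomplete.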

Partial Scoring

Some filters support partial scoring of the feature set: If nfeat is not NULL, only the best nfeat features are guaranteed to get a score. Additional features may be ignored for computational reasons, and then get a score value of NA.
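Partial scoring is most useful for greedy procedures that select one feature per iteration. The sketch below is a made-up relevance-minus-redundancy heuristic in base R, not the algorithm of any specific filter in the package; the function name greedy_partial and the rank-based score values are assumptions for illustration.

```r
# Hypothetical greedy scorer: stops after `nfeat` selections, so the
# remaining features are never evaluated further and keep the score NA.
greedy_partial = function(data, target, nfeat = NULL) {
  features = setdiff(names(data), target)
  if (is.null(nfeat)) nfeat = length(features)
  nfeat = min(nfeat, length(features))
  scores = setNames(rep(NA_real_, length(features)), features)
  selected = character(0)
  for (i in seq_len(nfeat)) {
    candidates = setdiff(features, selected)
    value = vapply(candidates, function(f) {
      # Relevance: absolute correlation with the target.
      relevance = abs(cor(data[[f]], data[[target]]))
      # Redundancy: mean absolute correlation with already selected features.
      redundancy = if (length(selected) > 0L) {
        mean(abs(cor(data[[f]], data[selected])))
      } else {
        0
      }
      relevance - redundancy
    }, numeric(1))
    best = names(which.max(value))
    selected = c(selected, best)
    # Earlier selections receive larger scores.
    scores[best] = (length(features) - i + 1) / length(features)
  }
  scores[order(scores, decreasing = TRUE, na.last = TRUE)]
}

res = greedy_partial(mtcars, target = "mpg", nfeat = 3)
print(res)
```

With nfeat = 3, only three of the ten candidate features in mtcars receive a score; the other seven are reported as NA at the end of the vector.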

See Also

Other Filter: FilterAUC, FilterAnova, FilterCMIM, FilterCarScore, FilterCorrelation, FilterDISR, FilterImportance, FilterInformationGain, FilterJMIM, FilterJMI, FilterKruskalTest, FilterMIM, FilterMRMR, FilterNJMIM, FilterPerformance, FilterVariance, mlr_filters

Aliases
  • Filter
Documentation reproduced from package mlr3filters, version 0.1.0, License: LGPL-3
