Filter
Filter Base Class
Base class for filters. Predefined filters are stored in the dictionary mlr_filters. A Filter calculates a score for each feature of a task. Important features get a large value and unimportant features get a small value. Note that filter scores may also be negative.
- Keywords
- datasets
Format
R6::R6Class object.
Construction
f = Filter$new(id, task_type, param_set, param_vals, feature_types, packages)
id :: character(1)
Identifier for the filter.
task_type :: character()
Types of the task the filter can operate on. E.g., "classif" or "regr".
param_set :: paradox::ParamSet
Set of hyperparameters.
param_vals :: named list()
Named list of hyperparameter settings.
feature_types :: character()
Feature types the filter operates on. Must be a subset of mlr_reflections$task_feature_types.
task_properties :: character()
Required task properties, see mlr3::Task. Must be a subset of mlr_reflections$task_properties.
packages :: character()
Set of required packages. Note that these packages will be loaded via requireNamespace(), and are not attached.
Fields
All arguments passed to the constructor are available as fields, and additionally:
scores :: named numeric()
Stores the calculated filter score values as a named numeric vector. The vector is sorted in decreasing order with possible NA values last. Tied values (this includes NA values) appear in a random, non-deterministic order.
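The ordering described for scores can be reproduced in base R. The following is a minimal sketch only: sort_scores() is a hypothetical helper written for illustration, not a function of the package, and the package may implement the ordering differently.

```r
# Hypothetical helper (not part of the package): order a named numeric
# vector the way the `scores` field is described.
sort_scores = function(scores) {
  # Shuffle first, so that ties are not resolved by their original position.
  shuffled = scores[sample.int(length(scores))]
  # Then sort in decreasing order and move NA values to the end.
  shuffled[order(shuffled, decreasing = TRUE, na.last = TRUE)]
}

raw = c(a = 0.2, b = NA, c = 0.9, d = 0.2, e = -0.5)
sorted = sort_scores(raw)
print(sorted)
```

Here "c" always comes first and "b" (the NA) always comes last, while the tied features "a" and "d" may swap places between calls.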
Methods
calculate(task, nfeat = NULL)
(mlr3::Task, integer(1)) -> self
Calculates the filter score values for the provided mlr3::Task and stores them in field scores. nfeat determines the minimum number of features to score (see "Partial Scoring"), and defaults to the number of features in task. Loads required packages and then calls $calculate_internal(). If the task has no rows, each feature gets the score NA.
calculate_internal(task, nfeat)
(mlr3::Task, integer(1)) -> named numeric()
Internal worker function. Each child class must implement this method. Takes a task and the minimum number of features to score, and must return a named numeric with scores. The higher the score, the more important the feature. The calling function (calculate()) ensures that the returned vector gets sorted and that missing feature scores get a score value of NA.
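The division of labour between calculate() and calculate_internal() can be illustrated without the package. The sketch below is an assumption-laden stand-in: calculate_sketch() and worker() are hypothetical names, and a character vector of feature names plus a row count replace a real mlr3::Task.

```r
# Hypothetical stand-in for the calculate() contract: call a worker, fill in
# NA for features the worker did not score, and sort the result.
calculate_sketch = function(feature_names, n_rows, worker, nfeat = NULL) {
  # nfeat defaults to the number of features
  if (is.null(nfeat)) nfeat = length(feature_names)
  # Start with NA for every feature; this also covers a task without rows
  scores = setNames(rep(NA_real_, length(feature_names)), feature_names)
  if (n_rows > 0L) {
    partial = worker(feature_names, nfeat)
    scores[names(partial)] = partial
  }
  # Decreasing order, NA last (random tie-breaking omitted for brevity)
  scores[order(scores, decreasing = TRUE, na.last = TRUE)]
}

# Hypothetical worker that returns scores for only two of three features
worker = function(feature_names, nfeat) c(x1 = 0.1, x2 = 0.7)

res = calculate_sketch(c("x1", "x2", "x3"), n_rows = 10L, worker)
empty = calculate_sketch(c("x1", "x2", "x3"), n_rows = 0L, worker)
print(res)
print(empty)
```

The worker only has to return a named numeric for the features it scored; the caller takes care of the NA fill-in and the ordering.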
Partial Scoring
Some filters support partial scoring of the feature set: If nfeat is not NULL, only the best nfeat features are guaranteed to get a score. Additional features may be ignored for computational reasons, and then get a score value of NA.
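Greedy selection procedures are one setting in which partial scoring can arise, because they can stop once enough features have been picked. The sketch below is purely illustrative: greedy_partial_scores() is a hypothetical function, and its relevance-minus-redundancy criterion based on correlations is an assumption chosen for brevity, not the method of any particular filter in the package.

```r
# Hypothetical greedy scorer: picks `nfeat` features one at a time and stops,
# so the remaining features are never scored and keep NA.
greedy_partial_scores = function(x, y, nfeat = NULL) {
  features = colnames(x)
  if (is.null(nfeat)) nfeat = length(features)
  scores = setNames(rep(NA_real_, length(features)), features)
  selected = character(0)
  while (length(selected) < nfeat) {
    candidates = setdiff(features, selected)
    if (length(candidates) == 0L) break
    gain = vapply(candidates, function(f) {
      # Relevance: absolute correlation with the target
      relevance = abs(cor(x[, f], y))
      # Redundancy: mean absolute correlation with features already selected
      redundancy = if (length(selected) > 0L) {
        mean(vapply(selected, function(s) abs(cor(x[, f], x[, s])), numeric(1)))
      } else 0
      relevance - redundancy
    }, numeric(1))
    best = names(which.max(gain))
    scores[best] = gain[[best]]
    selected = c(selected, best)
  }
  # Decreasing order, unscored features (NA) last
  sort(scores, decreasing = TRUE, na.last = TRUE)
}

idx = 1:20
x = cbind(a = idx, b = sin(idx), c = cos(idx), d = idx %% 3)
y = idx + sin(idx)
res = greedy_partial_scores(x, y, nfeat = 2)
print(res)
```

With nfeat = 2, two of the four features receive a score and the other two stay NA; with nfeat = NULL every feature is scored.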
See Also
Other Filter: FilterAUC, FilterAnova, FilterCMIM, FilterCarScore, FilterCorrelation, FilterDISR, FilterImportance, FilterInformationGain, FilterJMIM, FilterJMI, FilterKruskalTest, FilterMIM, FilterMRMR, FilterNJMIM, FilterPerformance, FilterVariance, mlr_filters