mlr3filters

Package website: release | dev

{mlr3filters} adds filter-based feature selection methods, and access to the embedded feature selection methods of learning algorithms, to {mlr3}.

Installation

CRAN version

install.packages("mlr3filters")

Development version

remotes::install_github("mlr-org/mlr3filters")

Filters

Filter Example

set.seed(1)
library("mlr3")
library("mlr3filters")

task = tsk("pima")
filter = flt("auc")
as.data.table(filter$calculate(task))
##     feature     score
## 1:  glucose 0.2927906
## 2:  insulin 0.2316288
## 3:     mass 0.1870358
## 4:      age 0.1869403
## 5:  triceps 0.1625115
## 6: pregnant 0.1195149
## 7: pressure 0.1075760
## 8: pedigree 0.1062015
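
A common next step is to keep only the highest-ranked features. The following is a minimal sketch, assuming that `as.data.table(filter)` returns the features sorted by decreasing score (as in the output above) and that the cutoff of three features is an arbitrary choice made for illustration:

```r
library("mlr3")
library("mlr3filters")

task = tsk("pima")
filter = flt("auc")
filter$calculate(task)

# Scores as a table, sorted by decreasing score
scores = as.data.table(filter)

# Names of the three best-ranked features (the cutoff 3 is illustrative)
top_features = head(scores$feature, 3)

# Restrict the task to these features; this modifies the task in place
task$select(top_features)
print(task$feature_names)
```

Because `task$select()` changes the task in place, work on `task$clone()` if the original task is still needed.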

Implemented Filters

| Name             | Task Type      | Feature Types                               | Package       |
| ---------------- | -------------- | ------------------------------------------- | ------------- |
| anova            | Classif        | Integer, Numeric                            | stats         |
| auc              | Classif        | Integer, Numeric                            | mlr3measures  |
| carscore         | Regr           | Numeric                                     | care          |
| cmim             | Classif & Regr | Integer, Numeric, Factor, Ordered           | praznik       |
| correlation      | Regr           | Integer, Numeric                            | stats         |
| disr             | Classif        | Integer, Numeric, Factor, Ordered           | praznik       |
| find_correlation | Classif & Regr | Integer, Numeric                            | stats         |
| importance       | Universal      | Logical, Integer, Numeric, Factor, Ordered  | rpart         |
| information_gain | Classif & Regr | Integer, Numeric, Factor, Ordered           | FSelectorRcpp |
| jmi              | Classif        | Integer, Numeric, Factor, Ordered           | praznik       |
| jmim             | Classif        | Integer, Numeric, Factor, Ordered           | praznik       |
| kruskal_test     | Classif        | Integer, Numeric                            | stats         |
| mim              | Classif        | Integer, Numeric, Factor, Ordered           | praznik       |
| mrmr             | Classif        | Integer, Numeric, Factor, Ordered           | praznik       |
| njmim            | Classif        | Integer, Numeric, Factor, Ordered           | praznik       |
| performance      | Universal      | Logical, Integer, Numeric, Factor, Ordered  |               |
| permutation      | Universal      | Logical, Integer, Numeric, Factor, Ordered  |               |
| variance         | Classif & Regr | Integer, Numeric                            | stats         |
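
The same information can be queried from R. This is a small sketch, assuming the filters are registered in the `mlr_filters` dictionary exported by {mlr3filters} and that each filter object exposes `feature_types` and `packages` fields. The choice of the `variance` filter is only for illustration:

```r
library("mlr3")
library("mlr3filters")

# Keys of all filters registered in the mlr_filters dictionary
filter_keys = mlr_filters$keys()
print(filter_keys)

# Metadata of a single filter, corresponding to columns of the table
filter = flt("variance")
print(filter$feature_types)
print(filter$packages)
```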

Variable Importance Filters

The following learners allow the extraction of variable importance and therefore are supported by FilterImportance:

## [1] "classif.featureless" "classif.ranger"      "classif.rpart"      
## [4] "classif.xgboost"     "regr.featureless"    "regr.ranger"        
## [7] "regr.rpart"          "regr.xgboost"

If your learner is not listed here although its fitted model provides variable importance, it has most likely not yet been integrated into the package mlr3learners or the extra learner organization. Please open an issue so we can add your package.
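
A list like the one above can be reproduced for the learners installed locally. The sketch below is one possible way to do it, not necessarily how the list was generated: it assumes that learners capable of returning variable importance carry the `"importance"` property, and it constructs every learner registered in `mlr_learners` to check for it. The result depends on which learner packages are loaded.

```r
library("mlr3")
library("mlr3learners")  # registers additional learners such as classif.ranger

# Construct every registered learner and check for the "importance" property
keys = mlr_learners$keys()
has_importance = vapply(
  keys,
  function(key) "importance" %in% lrn(key)$properties,
  logical(1)
)

importance_learners = keys[has_importance]
print(importance_learners)
```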

Some learners need to have their variable importance measure “activated” during learner creation. For example, to use the “impurity” measure of Random Forest via the {ranger} package:

library("mlr3learners")  # provides the classif.ranger learner

task = tsk("iris")
# Set the importance measure when the learner is created
learner = lrn("classif.ranger", importance = "impurity")

filter = flt("importance", learner = learner)
filter$calculate(task)
head(as.data.table(filter), 3)
##         feature    score
## 1:  Petal.Width 43.66496
## 2: Petal.Length 43.10837
## 3: Sepal.Length 10.21944

Performance Filter

FilterPerformance is a univariate filter method which calls resample() once for every predictor variable in the dataset, each time using that variable as the only feature, and ranks the features by the resulting score of the supplied measure. Any learner can be passed to this filter, with classif.rpart being the default. Regression learners can also be passed if the task is of type “regr”.
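
A minimal sketch of the performance filter on the iris task follows. The holdout resampling and the accuracy measure are choices made for this illustration, and the sketch assumes that `flt("performance", ...)` accepts `learner`, `resampling` and `measure` arguments. Scores depend on the random split, so no output is shown.

```r
library("mlr3")
library("mlr3filters")

set.seed(1)
task = tsk("iris")

# One resample() run per feature; features are ranked by the measure's score
filter = flt("performance",
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.acc")
)
filter$calculate(task)

performance_scores = as.data.table(filter)
print(performance_scores)
```

For a task of type “regr”, the learner and measure would be swapped for regression counterparts, for example `lrn("regr.rpart")` and `msr("regr.mse")`.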

Version: 0.3.0
License: LGPL-3
Last Published: July 18th, 2020
