mlr3filters v0.1.1

Filter Based Feature Selection for 'mlr3'

Extends 'mlr3' with filter methods for feature selection. Besides standalone filter methods, built-in methods of any machine-learning algorithm are supported. Partial scoring of multivariate filter methods is also supported.

Readme

mlr3filters

mlr3filters adds filter-based feature selection methods to mlr3, including the embedded feature selection methods of learning algorithms.

Installation

CRAN version

install.packages("mlr3filters")

Development version

remotes::install_github("mlr-org/mlr3filters")

Filters

Filter Example

library("mlr3")
library("mlr3filters")

task = tsk("pima")
filter = flt("auc")
as.data.table(filter$calculate(task))
##     feature     score
##      <char>     <num>
## 1:  glucose 0.2927906
## 2:  insulin 0.2316288
## 3:     mass 0.1870358
## 4:      age 0.1869403
## 5:  triceps 0.1625115
## 6: pregnant 0.1195149
## 7: pressure 0.1075760
## 8: pedigree 0.1062015
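
The ranking can then be used to reduce the task to its best-scoring features. The following is a minimal sketch, assuming the three highest-ranked features are wanted; the cutoff of 3 and the variable names `ranking` and `top_features` are illustrative choices, not part of the package.

```r
library("mlr3")
library("mlr3filters")

task = tsk("pima")
filter = flt("auc")
filter$calculate(task)

# as.data.table() returns one row per feature, sorted by decreasing score
ranking = as.data.table(filter)
top_features = head(ranking$feature, 3)

# keep only the selected features; this modifies the task in place
task$select(top_features)
task$feature_names
```

Because `task$select()` changes the task itself, clone the task first (`task$clone()`) if the full feature set is still needed afterwards.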

Implemented Filters

| Name | Task Type | Feature Types | Package |
|---|---|---|---|
| anova | Classif | Integer, Numeric | stats |
| auc | Classif | Integer, Numeric | mlr3measures |
| carscore | Regr | Numeric | care |
| cmim | Classif & Regr | Integer, Numeric, Factor, Ordered | praznik |
| correlation | Regr | Integer, Numeric | stats |
| disr | Classif | Integer, Numeric, Factor, Ordered | praznik |
| importance | Universal | Logical, Integer, Numeric, Character, Factor, Ordered | rpart |
| information_gain | Classif & Regr | Integer, Numeric, Factor, Ordered | FSelectorRcpp |
| jmi | Classif | Integer, Numeric, Factor, Ordered | praznik |
| jmim | Classif | Integer, Numeric, Factor, Ordered | praznik |
| kruskal_test | Classif | Integer, Numeric | stats |
| mim | Classif | Integer, Numeric, Factor, Ordered | praznik |
| mrmr | Classif & Regr | Numeric, Factor, Integer, Character, Logical | praznik |
| njmim | Classif | Integer, Numeric, Factor, Ordered | praznik |
| performance | Universal | Logical, Integer, Numeric, Character, Factor, Ordered | rpart |
| variance | Classif & Regr | Integer, Numeric | stats |
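
The same information can be queried from an R session. The sketch below assumes the installed version exposes the `feature_types` and `packages` fields on filter objects; the filter key "variance" is only an example.

```r
library("mlr3filters")

# mlr_filters is the dictionary that flt() looks keys up in
keys = mlr_filters$keys()
print(keys)

# construct one filter by key and inspect what it supports
filter = flt("variance")
filter$feature_types
filter$packages
```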

Variable Importance Filters

The following learners allow the extraction of variable importance and therefore are supported by FilterImportance:

## [1] "classif.featureless" "classif.ranger"      "classif.rpart"      
## [4] "classif.xgboost"     "regr.featureless"    "regr.lm"            
## [7] "regr.ranger"         "regr.rpart"          "regr.xgboost"

If your learner is not listed here but can extract variable importance from the fitted model, it has most likely not yet been integrated into the mlr3learners package or the extra learner organization. Please open an issue so we can add it.
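
Whether a given learner qualifies can be checked through its properties. This is a minimal sketch that relies on learners with variable importance support carrying the "importance" property; the two learner keys are taken from the list above, and the name `has_importance` is illustrative.

```r
library("mlr3")

# learners that can report variable importance list "importance"
# among their properties
learner = lrn("classif.rpart")
"importance" %in% learner$properties

# the same check for several learner keys at once
keys = c("classif.rpart", "classif.featureless")
has_importance = sapply(keys, function(key) {
  "importance" %in% lrn(key)$properties
})
has_importance
```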

Some learners need to have their variable importance measure “activated” during learner creation. For example, to use the “impurity” measure of Random Forest via the ranger package:

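library("mlr3learners") # provides the classif.ranger learner

task = tsk("iris")
learner = lrn("classif.ranger")
# request the impurity-based importance measure for the fitted model
learner$param_set$values$importance = "impurity"

filter = flt("importance", learner = learner)
filter$calculate(task)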
head(as.data.table(filter), 3)
##         feature     score
##          <char>     <num>
## 1:  Petal.Width 44.588117
## 2: Petal.Length 42.501367
## 3: Sepal.Length  9.898418

Performance Filter

FilterPerformance is a univariate filter method which calls resample() once for each predictor variable in the dataset, using that variable as the only feature, and ranks the features by the resulting performance according to the supplied measure. Any learner can be passed to this filter, with classif.rpart being the default. Regression learners can also be passed if the task is of type “regr”.
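
A minimal sketch of this filter on the iris task follows. It assumes that the constructor accepts `learner`, `resampling` and `measure` arguments through `flt()`; the 3-fold cross-validation and the accuracy measure are illustrative choices, not the defaults.

```r
library("mlr3")
library("mlr3filters")

set.seed(1) # the resampling split is random

task = tsk("iris")
filter = flt("performance",
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.acc")
)

# one resample() call per feature, so runtime grows with the number of features
filter$calculate(task)
scores = as.data.table(filter)
scores
```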

Functions in mlr3filters

| Name | Description |
|---|---|
| flt | Syntactic Sugar for Filter Construction |
| FilterMIM | Mutual Information Maximization Filter |
| FilterJMI | Joint Mutual Information Filter |
| FilterMRMR | Minimum Redundancy Maximal Relevancy Filter |
| as.data.table | Re-export of as.data.table; see data.table::as.data.table |
| FilterVariance | Variance Filter |
| FilterPerformance | Predictive Performance Filter |
| FilterNJMIM | Minimal Normalised Joint Mutual Information Maximisation Filter |
| mlr3filters-package | mlr3filters: Filter Based Feature Selection for 'mlr3' |
| FilterKruskalTest | Kruskal-Wallis Test Filter |
| mlr_filters | Dictionary of Filters |
| FilterJMIM | Minimal Joint Mutual Information Maximisation Filter |
| FilterAnova | ANOVA F-Test Filter |
| Filter | Filter Base Class |
| FilterImportance | Filter for Embedded Feature Selection via Variable Importance |
| FilterDISR | Double Input Symmetrical Relevance Filter |
| FilterCarScore | Correlation-Adjusted Marginal Correlation (CAR) Score Filter |
| FilterInformationGain | Information Gain Filter |
| FilterAUC | AUC Filter |
| FilterCorrelation | Correlation Filter |
| FilterCMIM | Minimal Conditional Mutual Information Filter |
Details

License LGPL-3
URL https://mlr3filters.mlr-org.com, https://github.com/mlr-org/mlr3filters
BugReports https://github.com/mlr-org/mlr3filters/issues
Encoding UTF-8
NeedsCompilation no
RoxygenNote 7.0.2
Collate 'Filter.R' 'mlr_filters.R' 'FilterAUC.R' 'FilterAnova.R' 'FilterCMIM.R' 'FilterCarScore.R' 'FilterCorrelation.R' 'FilterDISR.R' 'FilterImportance.R' 'FilterInformationGain.R' 'FilterJMI.R' 'FilterJMIM.R' 'FilterKruskalTest.R' 'FilterMIM.R' 'FilterMRMR.R' 'FilterNJMIM.R' 'FilterPerformance.R' 'FilterVariance.R' 'flt.R' 'reexports.R' 'zzz.R'
Packaged 2019-12-08 13:25:47 UTC; patrickschratz
Repository CRAN
Date/Publication 2019-12-08 13:40:02 UTC
