Selection By Filtering (SBF) Helper Functions
Ancillary functions for univariate feature selection
anovaFilter(x, y, cut = 0.05)
gamFilter(x, y, cut = 0.05)
caretSBF lmSBF rfSBF treebagSBF ldaSBF nbSBF
- x: a matrix or data frame of numeric predictors
- y: a numeric or factor vector of outcomes
- cut: a p-value cut-off
This page documents the functions that are used in selection by filtering (SBF). The functions described here are passed to the algorithm via the functions argument of sbfControl. See sbfControl for details on how these functions should be defined.
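As a rough illustration of how a set of SBF functions is handed to the algorithm, the sketch below passes lmSBF through the functions argument of sbfControl and then calls sbf. The simulated data, the choice of 5-fold cross-validation, and the variable names are assumptions made for this example only; the block is skipped if caret is not installed.

```r
# Minimal sketch: pass a set of SBF helper functions via sbfControl().
# The data are simulated purely for illustration.
if (requireNamespace("caret", quietly = TRUE)) {
  library(caret)

  set.seed(1)
  x <- data.frame(matrix(rnorm(100 * 5), ncol = 5))
  y <- x[[1]] + rnorm(100)  # only the first column is related to y

  # lmSBF bundles the scoring, filtering, fitting and prediction
  # functions used when the final model is a linear regression.
  ctrl <- sbfControl(functions = lmSBF, method = "cv", number = 5)

  fit <- sbf(x, y, sbfControl = ctrl)
  print(fit)
}
```

Swapping lmSBF for another of the listed objects (for example rfSBF) changes the model fitted after filtering, while the call pattern stays the same.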
anovaFilter and gamFilter are two examples of univariate filtering functions.
anovaFilter fits a simple linear model between a single feature and the outcome and computes the p-value for the whole-model F-test. If the p-value is less than the cut-off (cut, 0.05 by default), the feature is retained for the model.
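The whole-model F-test described above can be sketched with base R alone. This is not the package's implementation: anova_p_value is a hypothetical helper, the data are simulated, and a numeric outcome is assumed.

```r
# Hypothetical helper: p-value of the whole-model F-test for a simple
# linear model relating one numeric predictor to a numeric outcome.
anova_p_value <- function(x, y) {
  fit <- lm(y ~ x)
  f <- summary(fit)$fstatistic  # named vector: value, numdf, dendf
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
}

set.seed(1)
y <- rnorm(100)
predictors <- data.frame(
  signal = y + rnorm(100, sd = 0.5),  # strongly related to the outcome
  noise  = rnorm(100)                 # unrelated to the outcome
)

# Score each predictor on its own, then keep those below the cut-off.
cut <- 0.05
p_values <- vapply(predictors, anova_p_value, numeric(1), y = y)
retained <- names(p_values)[p_values < cut]
print(p_values)
print(retained)
```

Each predictor is scored independently of the others, which is what makes the filter univariate.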
gamFilter fits a generalized additive model between a single predictor and the outcome using a smoothing spline basis function. A p-value is generated using the whole-model test from summary.gam; p-values greater than the cut-off (0.05 by default) indicate that the predictor will be excluded.
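A comparable score for a nonlinear relationship might look like the sketch below. It assumes the mgcv package, which is one of the packages that provides gam and summary.gam; the page does not say which implementation gamFilter relies on. With a single smooth term, the p-value of that term is used here as a stand-in for the whole-model test. gam_p_value is a hypothetical helper and the data are simulated.

```r
# Sketch only: p-value for a single smooth term fitted with mgcv.
# Skipped if mgcv is not installed.
if (requireNamespace("mgcv", quietly = TRUE)) {
  library(mgcv)

  # Hypothetical helper: fit y ~ s(x) and return the p-value reported
  # by summary.gam for the smooth term.
  gam_p_value <- function(x, y) {
    fit <- gam(y ~ s(x))
    unname(summary(fit)$s.table[1, "p-value"])
  }

  set.seed(1)
  x <- runif(200)
  y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)  # clearly nonlinear signal

  p_smooth <- gam_p_value(x, y)
  keep <- p_smooth < 0.05  # predictor is excluded when this is FALSE
  print(p_smooth)
  print(keep)
}
```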
If a particular model fails for lm or gam, the predictor is not used in the model.
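One way to get this behaviour is to treat a failed fit as a p-value of 1, so that the predictor can never pass the cut-off. The sketch below shows the idea with a hypothetical wrapper, safe_p_value, and two toy scoring functions; mapping a failure to 1 is an assumption of this example, not something stated on this page.

```r
# Hypothetical wrapper: run a scoring function and fall back to a
# p-value of 1 when it errors or returns a missing or malformed value.
safe_p_value <- function(score_fun, x, y) {
  p <- tryCatch(score_fun(x, y), error = function(e) NA_real_)
  if (length(p) != 1 || is.na(p)) 1 else p
}

# Toy scoring functions used only to exercise the wrapper.
failing_score  <- function(x, y) stop("model did not converge")
constant_score <- function(x, y) 0.01

p_fail <- safe_p_value(failing_score, 1:10, 1:10)   # falls back to 1
p_ok   <- safe_p_value(constant_score, 1:10, 1:10)  # passes 0.01 through

cut <- 0.05
print(c(failed = p_fail < cut, ok = p_ok < cut))
```

Because 1 is never below a cut-off in (0, 1), a predictor whose model fails is dropped by the same comparison that filters every other predictor.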