MachineShop (version 3.2.0)

rfe: Recursive Feature Elimination

Description

A wrapper method of backward feature selection in which a given model is fit to nested subsets of most important predictor variables in order to select the subset whose resampled predictive performance is optimal.
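The procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not MachineShop's implementation: it uses the built-in mtcars data, a linear model, absolute t-statistics as a stand-in importance measure, and 5-fold cross-validated RMSE as the performance metric. The helper names cv_rmse and rank_predictors are hypothetical.

```r
# Illustrative sketch of recursive feature elimination in base R
# (assumptions: lm as the model, |t| as importance, CV RMSE as performance)
set.seed(123)

# Cross-validated RMSE of a linear model fit to a given predictor subset
cv_rmse <- function(data, response, predictors, folds = 5) {
  fold_id <- sample(rep(seq_len(folds), length.out = nrow(data)))
  sq_err <- numeric(0)
  for (k in seq_len(folds)) {
    train <- data[fold_id != k, , drop = FALSE]
    test <- data[fold_id == k, , drop = FALSE]
    fit <- lm(reformulate(predictors, response), data = train)
    sq_err <- c(sq_err, (test[[response]] - predict(fit, test))^2)
  }
  sqrt(mean(sq_err))
}

# Order predictors from most to least important by absolute t-statistic
rank_predictors <- function(data, response, predictors) {
  fit <- lm(reformulate(predictors, response), data = data)
  tvals <- abs(coef(summary(fit))[predictors, "t value"])
  names(sort(tvals, decreasing = TRUE))
}

response <- "mpg"
predictors <- setdiff(names(mtcars), response)
sizes <- c(10, 5, 2)  # nested subset sizes, largest first

results <- data.frame(size = integer(), performance = numeric())
for (size in sizes) {
  # Recompute importance on the current subset, then keep the top `size`
  ranked <- rank_predictors(mtcars, response, predictors)
  predictors <- head(ranked, size)
  results <- rbind(
    results,
    data.frame(size = size, performance = cv_rmse(mtcars, response, predictors))
  )
}

# Flag the subset with the lowest resampled RMSE
results$optimal <- results$performance == min(results$performance)
print(results)
```

Recomputing the ranking inside the loop mirrors the idea behind recompute = TRUE; ranking once before the loop would correspond to reusing the initial importances.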

Usage

rfe(...)

# S3 method for formula
rfe(
  formula,
  data,
  model,
  control = MachineShop::settings("control"),
  props = 4,
  sizes = integer(),
  random = FALSE,
  recompute = TRUE,
  optimize = c("global", "local"),
  samples = c(rfe = 1, varimp = 1),
  metrics = NULL,
  stat = "base::mean",
  ...
)

# S3 method for matrix
rfe(
  x,
  y,
  model,
  control = MachineShop::settings("control"),
  props = 4,
  sizes = integer(),
  random = FALSE,
  recompute = TRUE,
  optimize = c("global", "local"),
  samples = c(rfe = 1, varimp = 1),
  metrics = NULL,
  stat = "base::mean",
  ...
)

# S3 method for ModelFrame
rfe(
  input,
  model = NULL,
  control = MachineShop::settings("control"),
  props = 4,
  sizes = integer(),
  random = FALSE,
  recompute = TRUE,
  optimize = c("global", "local"),
  samples = c(rfe = 1, varimp = 1),
  metrics = NULL,
  stat = "base::mean",
  ...
)

# S3 method for recipe
rfe(
  input,
  model = NULL,
  control = MachineShop::settings("control"),
  props = 4,
  sizes = integer(),
  random = FALSE,
  recompute = TRUE,
  optimize = c("global", "local"),
  samples = c(rfe = 1, varimp = 1),
  metrics = NULL,
  stat = "base::mean",
  ...
)

# S3 method for MLModel
rfe(model, ...)

# S3 method for MLModelFunction
rfe(model, ...)

Arguments

...

arguments passed from the generic function to its methods and from the MLModel and MLModelFunction methods to others. The first arguments of rfe methods are positional and, as such, must be given first in calls to them.

formula, data

formula defining the model predictor and response variables and a data frame containing them.

model

model function, function name, or object; or another object that can be coerced to a model. A model can be given first followed by any of the variable specifications, and the argument can be omitted altogether in the case of modeled inputs.

control

control function, function name, or object defining the resampling method to be employed.

props

numeric vector of the proportions of most important predictor variables to retain in fitted models, or an integer number of equally spaced proportions to generate automatically; ignored if sizes are given.

sizes

integer vector of the set sizes of most important predictor variables to retain.

random

logical indicating whether to eliminate variables at random with probabilities proportional to their importance.

recompute

logical indicating whether to recompute variable importance after eliminating each set of variables.

optimize

character string specifying a search through all props to identify the globally optimal model ("global") or a search that stops after identifying the first locally optimal model ("local").
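The difference between the two search strategies can be illustrated with a small base-R sketch. The performance values below are hypothetical, lower values are assumed to be better, and the stopping rule shown for "local" (stop at the first subset that the next smaller subset fails to improve on) is an assumption made for illustration rather than a statement of how the package implements it.

```r
# Hypothetical performance values for nested subsets, ordered from the
# largest subset to the smallest; names are subset sizes, lower is better
perf <- c(`10` = 3.4, `7` = 3.1, `5` = 3.3, `2` = 2.9)

# "global": evaluate every subset and take the overall best
global_best <- names(perf)[which.min(perf)]

# "local" (assumed rule): stop at the first subset that is not improved
# on by the next smaller subset; default to the last subset otherwise
local_best <- names(perf)[length(perf)]
for (i in seq_along(perf)[-length(perf)]) {
  if (perf[i + 1] >= perf[i]) {
    local_best <- names(perf)[i]
    break
  }
}

print(c(global = global_best, local = local_best))
```

In this sketch the local search stops early and fits fewer models, while the global search pays for every subset size in exchange for finding the overall best one.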

samples

numeric vector or list giving the number of permutation samples for each of the rfe and varimp algorithms. One or both of the values may be specified as named arguments or in the order in which their defaults appear. Larger numbers of samples decrease variability in estimated model performances and variable importances at the expense of increased computation time. Samples are more expensive computationally for rfe than for varimp.

metrics

metric function, function name, or vector of these with which to calculate performance. If not specified, default metrics defined in the performance functions are used.

stat

function or character string naming a function to compute a summary statistic on resampled metric values and permuted samples.

x, y

matrix of predictor variables and an object containing the response variable.

input

input object defining and containing the model predictor and response variables.

Value

A data frame with columns for the numbers of predictor variables retained (size), their names (terms), logical indicators to identify the optimal model (optimal), and associated predictive performances (performance).

See Also

varimp

Examples

# NOT RUN {
## Requires prior installation of suggested package gbm to run

library(MachineShop)

rfe(sale_amount ~ ., data = ICHomes, model = GBMModel)
# }
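
A further sketch shows how some of the arguments documented above might be combined. It assumes the MachineShop package is installed and uses GLMModel and CVControl from that package; the subset sizes and fold count are illustrative choices, and subsetting the result by its optimal column relies on the return value being a data frame as described under Value.

```r
# Sketch only: requires the MachineShop package; argument values are
# illustrative choices, not recommendations
library(MachineShop)

set.seed(123)

# Evaluate subsets of the 2, 5, and 10 most important predictors with
# 5-fold cross-validation, searching all sizes for the best one
res <- rfe(
  sale_amount ~ ., data = ICHomes, model = GLMModel,
  control = CVControl(folds = 5),
  sizes = c(2, 5, 10),
  optimize = "global"
)

# Row flagged as the optimal subset
res[res$optimal, ]
```

Because sizes is given here, props would be ignored, as noted in the Arguments section.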