Provides an interface for deriving sparse prediction ensembles where basis functions are selected through L1 penalization.
gpe(formula, data, base_learners = list(gpe_trees(), gpe_linear()),
weights = rep(1, times = nrow(data)), sample_func = gpe_sample(),
verbose = FALSE, penalized_trainer = gpe_cv.glmnet(), model = TRUE)
formula: Symbolic description of the model to be fit, of the form y ~ x1 + x2 + ... + xn. If the output variable (left-hand side of the formula) is a factor, an ensemble for binary classification is created. Otherwise, an ensemble for prediction of a continuous variable is created.
data: data.frame containing the variables in the model.
base_learners: List of functions, each of which has formal arguments formula, data, weights, sample_func, verbose, and family, and returns a character vector of terms for the final formula passed to cv.glmnet. See gpe_linear, gpe_trees, and gpe_earth; a sketch of a custom base learner is given in the details below.
weights: Case weights with length equal to the number of rows in data.
sample_func: Function used to sample when learning with base learners. The function should have formal arguments n and weights and return a vector of indices. See gpe_sample; a sketch of a custom sampling function is given in the details below.
verbose: TRUE if progress comments should be printed throughout the computations.
penalized_trainer: Function with formal arguments x, y, weights, and family which returns a fit object. This can be changed to test other penalized trainers (for instance, functions that apply an L2 or elastic net penalty rather than an L1 penalty); a sketch follows the argument descriptions below. Not using cv.glmnet may cause other functions for gpe objects to fail. See gpe_cv.glmnet.
model: TRUE if the data should be added to the returned object.
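The following is a minimal sketch, referenced from the penalized_trainer description above, of a replacement penalized trainer. It is not part of the package: the name my_elastic_net_trainer and the choice alpha = 0.5 are illustrative assumptions only, and, as noted above, replacing gpe_cv.glmnet() may cause other functions for gpe objects to fail.

# Illustrative penalized trainer with the formal arguments x, y, weights,
# and family that gpe expects; it returns the cv.glmnet fit object.
my_elastic_net_trainer <- function(x, y, weights, family) {
  glmnet::cv.glmnet(x = x, y = y, weights = weights, family = family,
                    alpha = 0.5)
}

# Hypothetical use (dat, y, x1, x2 are placeholders):
# fit <- gpe(y ~ x1 + x2, data = dat, penalized_trainer = my_elastic_net_trainer)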
gpe returns an object of class gpe.
gpe provides a more general framework for making a sparse prediction ensemble than pre. A similar fit to pre can be estimated with the following call:
gpe(formula = y ~ x1 + x2 + x3, data = data, base_learners = list(gpe_linear(), gpe_trees()))
Products of hinge functions using MARS can be added to the ensemble above with the following call:
gpe(formula = y ~ x1 + x2 + x3, data = data, base_learners = list(gpe_linear(), gpe_trees(), gpe_earth()))
Other custom base learners can be implemented; see gpe_trees, gpe_linear, or gpe_earth for details of the setup. The sampling function given by sample_func can also be replaced by a custom sampling function; see gpe_sample for details of the setup. A minimal sketch of both is shown below.
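As a rough sketch of what such custom components could look like (the names and internals below are illustrative assumptions, not functions from the package): a base learner must accept formula, data, weights, sample_func, verbose, and family and return a character vector of terms, and a sampling function must accept n and weights and return a vector of row indices.

# Illustrative base learner: simply returns the main-effect terms of the formula.
my_main_effects_learner <- function(formula, data, weights, sample_func, verbose, family) {
  if (verbose) message("Adding main effects as terms")
  attr(terms(formula, data = data), "term.labels")
}

# Illustrative sampling function: a weighted bootstrap of the row indices.
my_bootstrap_sample <- function(n, weights) {
  sample.int(n, size = n, replace = TRUE, prob = weights)
}

# Hypothetical call combining the two (dat, y, x1, x2, x3 are placeholders):
# gpe(formula = y ~ x1 + x2 + x3, data = dat,
#     base_learners = list(gpe_linear(), my_main_effects_learner),
#     sample_func = my_bootstrap_sample)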
Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3), 916-954.
See also pre, gpe_trees, gpe_linear, gpe_earth, gpe_sample, and gpe_cv.glmnet.
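A minimal end-to-end sketch (illustrative, not taken from the package documentation; it assumes the pre package is installed and that the usual print and predict methods for gpe objects are available):

library(pre)

# Complete-case version of the built-in airquality data, used purely for illustration.
dat <- airquality[complete.cases(airquality), ]

set.seed(42)  # base learners are fit on random samples, so fix the seed
fit <- gpe(Ozone ~ Solar.R + Wind + Temp, data = dat,
           base_learners = list(gpe_linear(), gpe_trees()))

fit                                    # print the fitted ensemble
preds <- predict(fit, newdata = dat)   # in-sample predictions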