Provides an interface for deriving sparse prediction ensembles where basis functions are selected through L1 penalization.
Usage

gpe(formula, data, base_learners = list(gpe_trees(), gpe_linear()),
    weights = rep(1, times = nrow(data)), sample_func = gpe_sample(),
    verbose = FALSE, penalized_trainer = gpe_cv.glmnet(), model = TRUE)

Arguments

formula: Symbolic description of the model to be fit, of the form y ~ x1 + x2 + ... + xn. If the output variable (left-hand side of the formula) is a factor, an ensemble for binary classification is created. Otherwise, an ensemble for prediction of a continuous variable is created.
data: data.frame containing the variables in the model.
base_learners: List of functions that each have the formal arguments formula, data, weights, sample_func, verbose, and family, and return a character vector of terms for the final formula passed to cv.glmnet. See gpe_linear, gpe_trees, and gpe_earth.
weights: Case weights with length equal to the number of rows in data.
sample_func: Function used to sample observations when learning with the base learners. The function should have formal arguments n and weights and return a vector of indices. See gpe_sample.
verbose: TRUE if progress information should be printed during the computations.
penalized_trainer: Function with formal arguments x, y, weights, and family which returns a fit object. It can be replaced to test other penalized trainers (e.g., functions that apply an L2 or elastic net penalty instead of the L1 penalty); a sketch of such a replacement is shown after this argument list. Note that not using cv.glmnet may cause other functions for gpe objects to fail. See gpe_cv.glmnet.
model: TRUE if the data should be added to the returned object.
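The default penalized_trainer is constructed with gpe_cv.glmnet and wraps cv.glmnet. As a rough illustration of the required signature, a minimal sketch of a replacement is given below, assuming the glmnet package is installed; the name ridge_trainer and the choice alpha = 0 (an L2 penalty) are assumptions for illustration only. The sketch still returns a cv.glmnet fit, so other functions for gpe objects should remain usable.

library(glmnet)

## Illustrative custom penalized trainer: same formal arguments as the default
## (x, y, weights, family), but fits a ridge (L2) penalty instead of the lasso.
ridge_trainer <- function(x, y, weights, family) {
  glmnet::cv.glmnet(x = x, y = y, weights = weights, family = family,
                    alpha = 0)  # alpha = 0 selects the L2 (ridge) penalty
}

## Hypothetical use:
## gpe(y ~ x1 + x2 + x3, data = data, penalized_trainer = ridge_trainer)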
Value

An object of class gpe.
Details

gpe provides a more general framework for deriving a sparse prediction ensemble than pre. A fit similar to that of pre can be estimated with the following call:
gpe(formula = y ~ x1 + x2 + x3, data = data, base_learners = list(gpe_linear(), gpe_trees()))
Products of hinge functions using MARS can be added to the ensemble above with the following call:
gpe(formula = y ~ x1 + x2 + x3, data = data, base_learners = list(gpe_linear(), gpe_trees(), gpe_earth()))
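For concreteness, a self-contained sketch of the calls above is given below, using simulated data; the data-generating code and variable names are illustrative assumptions (the example assumes the pre package is installed, plus the earth package for gpe_earth()).

library(pre)

set.seed(42)
n <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + sin(3 * dat$x2) + rnorm(n)  # numeric outcome: regression ensemble

## Ensemble with linear terms, rules from trees, and MARS hinge functions
fit <- gpe(formula = y ~ x1 + x2 + x3, data = dat,
           base_learners = list(gpe_linear(), gpe_trees(), gpe_earth()))
fit  # printing the object summarizes the penalized ensemble

## A factor outcome would instead yield a binary classification ensemble, e.g.:
## dat$y2 <- factor(dat$y > median(dat$y))
## fit2 <- gpe(formula = y2 ~ x1 + x2 + x3, data = dat)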
Other custom base learners can be implemented; see gpe_trees, gpe_linear, or gpe_earth for details of the setup. The sampling function given by sample_func can also be replaced by a custom sampling function; see gpe_sample for details of the setup. Rough sketches of both are given below.
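The following sketch illustrates the contracts described above: a custom sampling function with formal arguments n and weights that returns indices, and a toy base learner that returns character terms. The names half_sample and gpe_squares and the specific choices are assumptions; the authoritative requirements are documented in gpe_sample, gpe_trees, gpe_linear, and gpe_earth.

## Custom sampler: draws half of the observations without replacement,
## with selection probabilities proportional to the case weights.
half_sample <- function(n, weights) {
  sample.int(n, size = max(1L, floor(n / 2)), replace = FALSE, prob = weights)
}

## Toy base learner: returns character terms (here, squared numeric predictors)
## for the final formula passed to cv.glmnet. It accepts the formal arguments
## described above even though only formula and data are used here.
gpe_squares <- function(formula, data, weights, sample_func, verbose, family) {
  x_names <- attr(terms(formula, data = data), "term.labels")
  numeric_x <- x_names[vapply(data[x_names], is.numeric, logical(1))]
  paste0("I(", numeric_x, "^2)")
}

## Hypothetical use:
## gpe(y ~ x1 + x2 + x3, data = dat,
##     base_learners = list(gpe_linear(), gpe_squares), sample_func = half_sample)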
References

Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3), 916-954.
See Also

pre, gpe_trees, gpe_linear, gpe_earth, gpe_sample, gpe_cv.glmnet