quadrupen-package: Sparsity by Worst-Case Quadratic Penalties

Description

This package is designed to fit accurately several popular penalized linear regression models using the algorithm proposed in Grandvalet, Chiquet and Ambroise (submitted) by solving quadratic problems with increasing size.

Arguments

Features

At the moment, two R fitting functions are available:

the elastic.net function, which solves a family of linear regression problems penalized by a mixture of $l1$ and $l2$ norms. It notably includes the LASSO (Tibshirani, 1996), the adaptive-LASSO (Zou, 2006), the Elastic-net (Zou and Hastie, 2006) or the Structured Elastic-net (Slawski et al., 2010). See examples as well as the available demo(quad_enet).
the bounded.reg function, which fits a linear model penalized by a mixture of $infinity$ and $l2$ norms. It owns the same versatility as the elastic.net function regarding the $l2$ norm, yet the $l1$-norm is replaced by the infinity norm. Check demo(quad_breg) and examples.

The problem commonly solved for these two functions writes β^hat _λ₁,λ₂ = argmin_β 1/2 RSS(&beta) + λ₁ | D β |_q + λ/2 ₂ β^T S β, where $q=1$ for elastic.net and $q=infinity$ for bounded.reg. The diagonal matrix $D$ allows different weights for the first part of the penalty. The structuring matrix $S$ can be used to introduce some prior information regarding the predictors. It is provided via a positive semidefinite matrix. The S4 objects produced by the fitting procedures own the classical methods for linear model in R, as well as methods for plotting, (double) cross-validation and for the stability selection procedure of Meinshausen and Buhlmann (2010). All the examples of this documentation have been included to the package source, in the 'examples' directory. Some (too few!) routine testing scripts using the testhat package are also present in the 'tests' directory, where we check basic functionalities of the code, especially the reproducibility of the Lasso/Elastic-net solution path with the lars, elasticnet and glmnet packages. We also check the handling of runtime errors or unstabilities.

Algorithm

The general strategy of the algorithm relies on maintaining an active set of variables, starting from a vector of zeros. The underlying optimization problem is solved only on the activated variables, thus handling with small smooth problems with increasing size. Hence, by considering a decreasing grid of values for the penalty $lambda1$ and fixing $lambda2$, we may explore the whole path of solutions at a reasonable numerical cost, providing that $lambda1$ does not end up too small. For the $l1$-based methods (available in the elastic.net function), the size of the underlying problems solved is related to the number of nonzero coefficients in the vector of parameters. With the $infinity$-norm, (available in the boundary.reg function), we do not produce sparse estimator. Nevertheless, the size of the systems solved along the path deals with the number of unbounded variables for the current penalty level, which is quite smaller than the number of predictors for a reasonable $lambda1$. The same kind of proposal was made in Zhao, Rocha and Yu (2009). Underlying optimization is performed by direct resolution of quadratic sub problems, which is the main purpose of this package. This strategy is thoroughly exposed in Grandvalet, Chiquet and Ambroise (submitted). Still, we also implemented the popular and versatile proximal (FISTA) approaches for routine checks and numerical comparisons. A coordinate descent approach is also included, yet only for the elastic.net fitting procedure. The default setting uses the quadratic approach that gives its name to the package. It has been optimized to be the method of choice for small and medium scale problems, and produce very accurate solutions. However, the first order methods (coordinate descent and FISTA) can be interesting in situations where the problem is close to singular, in which case the Cholesky decomposition used in the quadratic solver can be computationally unstable. Though it is extremely unlikely for elastic.net -- and if so, we encourage the user to send us back any report of such an event --, this happens at times with bounded.reg. Regarding this issue, we let the possibility for the user to run the optimization of the bounded.reg criterion in a (hopefully) 'bulletproof' mode: using mainly the fast and accurate quadratic approach, it switches to the slower but more robust proximal resolution when unstability is detected.

Technical remarks

Most of the numerical work is done in C++, relying on the RcppArmadillo package. We also provide a (double) cross-validation procedure and functions for stabilty selection, both using the multi-core capability of the computer, through the parallel package. This feature is not available for Windows user, though. Finally, note that the plot methods enjoy some (still very few) of the capabilities of the ggplot2 package. We hope to enrich quadrupen with other popular fitting procedures and develop other statistical tools, particularly towards bootstrapping and model selection purpose. Sparse matrix encoding is partially supported at the moment, and will hopefully be thoroughly available in the future, thanks to upcoming updates of the great RcppArmadillo package.

References

Yves Grandvalet, Julien Chiquet and Christophe Ambroise, Sparsity by Worst-case Quadratic Penalties, arXiv preprint, 2012.

Nicolas Meinshausen and Peter Buhlmann. Stability Selection, JRSS(B), 2010.
Martin Slawski, Wolfgang zu Castell, and Gerhard Tutz. Feature selection guided by structural information, AOAS, 2010.
Peng Zhao, Guillerme Rocha and Bin Yu. The composite absolute penalties family for grouped and hierarchical variable selection, The Annals of Statistics, 2009.
Hui Zou. The Adaptive Lasso and Its Oracle Properties, JASA, 2006.
Hui Zou and Trevor Hastie. Regularization and variable selection via the elastic net, JRSS(B), 2006.
Robert Tibshirani. Regression Shrinkage and Selection via the Lasso, JRSS(B), 1996.