Features
At the moment, two R fitting functions are
available:
- the
elastic.net function, which
solves a family of linear regression problems penalized
by a mixture of $l1$ and $l2$
norms. It notably includes the LASSO (Tibshirani, 1996),
the adaptive-LASSO (Zou, 2006), the Elastic-net (Zou and
Hastie, 2006) or the Structured Elastic-net (Slawski et
al., 2010). See examples as well as the available
demo(quad_enet).
- the
bounded.reg function, which fits
a linear model penalized by a mixture of
$infinity$ and $l2$ norms.
It owns the same versatility as the elastic.net
function regarding the $l2$ norm, yet the
$l1$-norm is replaced by the infinity norm.
Check demo(quad_breg) and examples.
The problem commonly solved for these two functions
writes
βhat
λ1,λ2 =
argminβ 1/2 RSS(&beta) +
λ1 | D β |q +
λ/2 2 βT S β,
where $q=1$ for
elastic.net and $q=infinity$ for
bounded.reg. The diagonal matrix $D$
allows different weights for the first part of the
penalty. The structuring matrix $S$ can be used to
introduce some prior information regarding the
predictors. It is provided via a positive semidefinite
matrix. The S4 objects produced by the fitting procedures own the
classical methods for linear model in R, as well
as methods for plotting, (double) cross-validation and
for the stability selection procedure of Meinshausen and
Buhlmann (2010). All the examples of this documentation have been included
to the package source, in the 'examples' directory. Some
(too few!) routine testing scripts using the
testhat package are also present in the 'tests'
directory, where we check basic functionalities of the
code, especially the reproducibility of the
Lasso/Elastic-net solution path with the lars,
elasticnet and glmnet packages. We also
check the handling of runtime errors or unstabilities.Algorithm
The general strategy of the algorithm relies on
maintaining an active set of variables, starting from a
vector of zeros. The underlying optimization problem is
solved only on the activated variables, thus handling
with small smooth problems with increasing size. Hence,
by considering a decreasing grid of values for the
penalty $lambda1$ and fixing
$lambda2$, we may explore the whole path
of solutions at a reasonable numerical cost, providing
that $lambda1$ does not end up too small. For the $l1$-based methods (available in the
elastic.net function), the size of the underlying
problems solved is related to the number of nonzero
coefficients in the vector of parameters. With the
$infinity$-norm, (available in the
boundary.reg function), we do not produce sparse
estimator. Nevertheless, the size of the systems solved
along the path deals with the number of unbounded
variables for the current penalty level, which is quite
smaller than the number of predictors for a reasonable
$lambda1$. The same kind of proposal was
made in Zhao, Rocha and Yu (2009). Underlying optimization is performed by direct resolution
of quadratic sub problems, which is the main purpose of
this package. This strategy is thoroughly exposed in
Grandvalet, Chiquet and Ambroise (submitted). Still, we
also implemented the popular and versatile proximal
(FISTA) approaches for routine checks and numerical
comparisons. A coordinate descent approach is also
included, yet only for the elastic.net fitting
procedure. The default setting uses the quadratic approach that
gives its name to the package. It has been optimized to
be the method of choice for small and medium scale
problems, and produce very accurate solutions. However,
the first order methods (coordinate descent and FISTA)
can be interesting in situations where the problem is
close to singular, in which case the Cholesky
decomposition used in the quadratic solver can be
computationally unstable. Though it is extremely unlikely
for elastic.net -- and if so, we encourage
the user to send us back any report of such an event --,
this happens at times with bounded.reg.
Regarding this issue, we let the possibility for the user
to run the optimization of the bounded.reg
criterion in a (hopefully) 'bulletproof' mode: using
mainly the fast and accurate quadratic approach, it
switches to the slower but more robust proximal
resolution when unstability is detected.Technical remarks
Most of the numerical work is done in C++, relying on the
RcppArmadillo package. We also provide a (double)
cross-validation procedure and functions for stabilty
selection, both using the multi-core capability of the
computer, through the parallel package. This
feature is not available for Windows user, though.
Finally, note that the plot methods enjoy some (still
very few) of the capabilities of the ggplot2
package. We hope to enrich quadrupen with other popular
fitting procedures and develop other statistical tools,
particularly towards bootstrapping and model selection
purpose. Sparse matrix encoding is partially supported at
the moment, and will hopefully be thoroughly available in
the future, thanks to upcoming updates of the great
RcppArmadillo package.