Adjust a linear model penalized by a mixture of a (possibly weighted) \(\ell_\infty\)-norm (bounding the magnitude of the parameters) and a (possibly structured) \(\ell_2\)-norm (ridge-like). The solution path is computed at a grid of values for the infinity-penalty, fixing the amount of \(\ell_2\) regularization. See details for the criterion optimized.
bounded.reg(
x,
y,
lambda1 = NULL,
lambda2 = 0.01,
penscale = rep(1, p),
struct = NULL,
intercept = TRUE,
normalize = TRUE,
naive = FALSE,
nlambda1 = ifelse(is.null(lambda1), 100, length(lambda1)),
min.ratio = ifelse(n an object with class quadrupen, see the
documentation page quadrupen for details.
matrix of features, possibly sparsely encoded
(experimental). Do NOT include intercept. When normalized os
TRUE, coefficients will then be rescaled to the original
scale.
response vector.
sequence of decreasing \(\ell_\infty\)
penalty levels. If NULL (the default), a vector is
generated with nlambda1 entries, starting from a guessed
level lambda1.max where only the intercept is included,
then shrunken to min.ratio*lambda1.max.
real scalar; tunes the \(\ell_2\)-penalty in the bounded regression. Default is 0.01. Set to 0 to regularize only by the infinity norm (be careful regarding numerical stability in that case, particularly in the high dimensional setting).
vector with real positive values that weight the infinity norm of each feature. Default set all weights to 1. See details below.
matrix structuring the coefficients. Must be at
least positive semidefinite (this is checked internally if the
checkarg argument is TRUE). The
default uses the identity matrix. See details below.
logical; indicates if an intercept should be
included in the model. Default is TRUE.
logical; indicates if variables should be
normalized to have unit L2 norm before fitting. Default is
TRUE.
logical; Compute either 'naive' of 'classic' bounded
regression: mimicking the Elastic-net, the vector of parameters is
rescaled by a coefficient (1+lambda2) when naive
equals FALSE. No rescaling otherwise. Default is
FALSE.
integer that indicates the number of values to put
in the lambda1 vector. Ignored if lambda1 is
provided.
minimal value of infinity-part of the penalty
that will be tried, as a fraction of the maximal lambda1
value. A too small value might lead to unstability at the end of
the solution path corresponding to small lambda1. The
default value tries to avoid this, adapting to the
'\(n<p\)' context. Ignored if lambda1 is provided.
integer; limits the number of features ever to
enter the model: in our implementation of the bounded regression,
it corresponds to the variables which have left the boundary along
the path. The algorithm stops if this number is exceeded and
lambda1 is cut at the corresponding level. Default is
min(nrow(x),ncol(x)) for small lambda2 (<0.01) and
min(4*nrow(x),ncol(x)) otherwise. Use with care, as it
considerably changes the computation time.
list of argument controlling low level options of the algorithm --use with care and at your own risk-- :
verbose: integer; activate verbose mode --this one
is not too much risky!-- set to 0 for no output; 1
for warnings only, and 2 for tracing the whole
progression. Default is 1. Automatically set to 0
when the method is embedded within cross-validation or stability
selection.
timer: logical; use to record the timing of the
algorithm. Default is FALSE.
max.iter: the maximal number of iteration used to
solve the problem for a given value of lambda1. Default is
500.
method: a string for the underlying solver
used. Either "quadra" or "fista" are available for
bounded regression. Default is "quadra".
threshold: a threshold for convergence. The
algorithm stops when the optimality conditions are fulfill up to
this threshold. Default is 1e-7 for "quadra" and
1e-2 for "fista".
bulletproof: logical; indicates if the bulletproof
mode should be used while running the "quadra"
method. Default is TRUE. See details below.
logical; should arguments be checked to
(hopefully) avoid internal crashes? Default is
TRUE. Automatically set to FALSE when calls are made
from cross-validation or stability selection procedures.
The optimized criterion is βhatλ1,λ2 = argminβ 1/2 RSS(β) + λ1 | D β |∞ + λ/2 2 βT S β,
where
\(D\) is a diagonal matrix, whose diagonal terms are provided
as a vector by the penscale argument. The \(\ell_2\)
structuring matrix \(S\) is provided via the struct
argument, a positive semidefinite matrix (possibly of class
Matrix).
Note that the quadratic algorithm for the bounded regression may
become unstable along the path because of singularity of the
underlying problem, e.g. when there are too much correlation or
when the size of the problem is close to or smaller than the
sample size. In such cases, it might be a good idea to switch to
the proximal solver, slower yet more robust. This is the strategy
adopted by the 'bulletproof' mode, that will send a warning
while switching the method to 'fista' and keep on
optimizing on the remainder of the path. When bulletproof
is set to FALSE, the algorithm stops at an early stage of
the path of solutions. Hence, users should be careful when
manipulating the resulting 'quadrupen' object, as it will
not have the size expected regarding the dimension of the
lambda1 argument.
Singularity of the system can also be avoided with a larger
\(\ell_2\)-regularization, via lambda2, or a
"not-too-small" \(\ell_\infty\) regularization, via
a larger 'min.ratio' argument.
See also quadrupen,
plot,quadrupen-method and crossval.
## Simulating multivariate Gaussian with blockwise correlation
## and piecewise constant vector of parameters
beta <- rep(c(0,1,0,-1,0), c(25,10,25,10,25))
cor <- 0.75
Soo <- toeplitz(cor^(0:(25-1))) ## Toeplitz correlation for irrelevant variables
Sww <- matrix(cor,10,10) ## bloc correlation between active variables
Sigma <- bdiag(Soo,Sww,Soo,Sww,Soo)
diag(Sigma) <- 1
n <- 50
x <- as.matrix(matrix(rnorm(95*n),n,95) %*% chol(Sigma))
y <- 10 + x %*% beta + rnorm(n,0,10)
## Infinity norm without/with an additional l2 regularization term
## and with structuring prior
labels <- rep("irrelevant", length(beta))
labels[beta != 0] <- "relevant"
plot(bounded.reg(x,y,lambda2=0) , label=labels) ## a mess
plot(bounded.reg(x,y,lambda2=10), label=labels) ## good guys are at the boundaries
Run the code above in your browser using DataLab