glmpath: Fits the entire L1 regularization path for generalized linear models

Description

This algorithm uses the predictor-corrector method to compute the entire regularization path for generalized linear models with an L1 penalty.

Usage

glmpath(x, y, data, nopenalty.subset = NULL, family = binomial,
          weight = rep(1, n), offset = rep(0, n), lambda2 = 1e-5,
          max.steps = 10 * min(n, m), max.norm = 100 * m,
          min.lambda = (if (m >= n) 1e-6 else 0), max.vars = Inf,
          max.arclength = Inf, frac.arclength = 1, add.newvars = 1,
          bshoot.threshold = 0.1, relax.lambda = 1e-8,
          standardize = TRUE, eps = .Machine$double.eps,
          trace = FALSE)

Arguments

x

matrix of features

y

response

data

a list consisting of x: a matrix of features and y: response. data is not needed if x and y are input separately.

nopenalty.subset

a set of indices for the predictors that are not subject to the L1 penalty

family

name of a family function that represents the distribution of y to be used in the model. It must be binomial, gaussian, or poisson. For each one, the canonical link function is used; logit for binomial, identity for gaussian, and log for poisson distribution. Default is binomial.

weight

an optional vector of weights for observations

offset

an optional vector of offsets. If a column of x is used as the offset, the corresponding column must be removed from x.

lambda2

regularization parameter for the L2 norm of the coefficients. Default is 1e-5.

max.steps

an optional bound for the number of steps to be taken. Default is 10 * min(nrow(x), ncol(x)).

max.norm

an optional bound for the L1 norm of the coefficients. Default is 100 * ncol(x).

min.lambda

an optional (lower) bound for the size of \(\lambda\). Default is 0 for ncol(x) < nrow(x) cases and 1e-6 otherwise.

max.vars

an optional bound for the number of active variables. Default is Inf.

max.arclength

an optional bound for arc length (L1 norm) of a step. If max.arclength is extremely small, an exact nonlinear path is produced. Default is Inf.

frac.arclength

Under the default setting, the next step size is computed so that the active set changes right at the next value of lambda. When frac.arclength is set to a fraction between 0 and 1, the step size is reduced by a factor of frac.arclength in arc length. For example, with frac.arclength = 0.2 the step length is adjusted so that the active set would change after five smaller steps. Either max.arclength or frac.arclength can be used to force the path to be more accurate. Default is 1.

add.newvars

the number of candidate variables (currently not in the active set) that are used in the corrector step as potential active variables. Default is 1.

bshoot.threshold

If the absolute value of a coefficient is larger than bshoot.threshold at the first corrector step in which it becomes nonzero, \(\lambda\) is considered to have been decreased too far and is increased again; that is, a backward distance in \(\lambda\) that makes the coefficient zero is computed. Default is 0.1.

relax.lambda

A variable joins the active set if \(|l'(\beta)| > \lambda (1 - \text{relax.lambda})\). Default is 1e-8. If no variable joins the active set even after many (>20) steps, the user should increase relax.lambda to 1e-7 or 1e-6, but not more than that. This adjustment is sometimes needed because of numerical precision and error propagation problems. In general, the paths are less accurate with a relaxed lambda.

standardize

If TRUE, predictors are standardized to have unit variance.

eps

an effective zero

trace

If TRUE, the algorithm prints out its progress.
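The arguments above can be combined as in the following minimal sketch. It assumes the glmpath package is installed and uses the heart.data list that ships with it; the object names (fit.list, fit.fine, fit.nopen) and the choice of leaving the first predictor unpenalized are illustrative only.

```r
library(glmpath)   # assumes the glmpath package is installed
data(heart.data)   # a list with components x (features) and y (response)

# Pass x and y together through the 'data' argument.
fit.list <- glmpath(data = heart.data, family = binomial)

# Pass x and y separately and take smaller steps: with
# frac.arclength = 0.2 each step covers one fifth of the arc length
# that would otherwise be taken to the next active-set change.
fit.fine <- glmpath(heart.data$x, heart.data$y, family = binomial,
                    frac.arclength = 0.2)

# Exclude the first predictor from the L1 penalty
# (the index 1 is an arbitrary choice for illustration).
fit.nopen <- glmpath(heart.data$x, heart.data$y, family = binomial,
                     nopenalty.subset = 1)
```

Smaller values of frac.arclength or max.arclength trade extra steps for a path that follows the exact nonlinear solution more closely.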

Value

A glmpath object is returned.

lambda

vector of \(\lambda\) values for which the exact coefficients are computed

lambda2

\(\lambda_2\) used

step.length

vector of step lengths in \(\lambda\)

corr

matrix of \(l'(\beta)\) values (derivatives of the log-likelihood)

new.df

vector of degrees of freedom (to be used in the plot function)

df

vector of degrees of freedom at each step

deviance

vector of deviance computed at each step

aic

vector of AIC values

bic

vector of BIC values

b.predictor

matrix of coefficient estimates from the predictor steps

b.corrector

matrix of coefficient estimates from the corrector steps

new.A

vector of boolean values indicating the steps at which the active set changed (to be used in the plot/predict functions)

actions

actions taken at each step

meanx

means of the columns of x

sdx

standard deviations of the columns of x

xnames

column names of x

family

family used

weight

weights used

offset

offset used

nopenalty.subset

nopenalty.subset used

standardize

TRUE if the predictors were standardized before fitting
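As a hedged illustration of how the returned components might be used, the sketch below picks the step with the smallest AIC and looks up the corresponding \(\lambda\) and coefficients. It assumes the glmpath package is installed, that aic and lambda have one entry per step, and that the rows of b.corrector correspond to steps; the names fit and best are illustrative.

```r
library(glmpath)   # assumes the glmpath package is installed
data(heart.data)

fit <- glmpath(heart.data$x, heart.data$y, family = binomial)

# Step at which AIC is smallest.
best <- which.min(fit$aic)

# Regularization parameter at that step.
fit$lambda[best]

# Coefficient estimates from the corrector step
# (assumes one row of b.corrector per step).
fit$b.corrector[best, ]
```

The same pattern applies to bic if a sparser model is preferred.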

Details

This algorithm implements the predictor-corrector method to determine the entire path of the coefficient estimates as the amount of regularization varies; it computes a series of solution sets, each time estimating the coefficients with less regularization, based on the previous estimate. The coefficients are estimated with no error at the knots, and the values are connected, thereby making the paths piecewise linear.
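The knots are the values of \(\lambda\) at which the active set changes, and the entry rule given under relax.lambda decides when a variable joins. The following base R sketch illustrates that rule for the first knot of a logistic model on simulated data. It is not the package's internal code: the simulated data, the variable names, and the scaling of the gradient are assumptions made for illustration.

```r
set.seed(1)
n <- 50
m <- 4
x <- scale(matrix(rnorm(n * m), n, m))   # standardized predictors
y <- rbinom(n, 1, plogis(x[, 1]))        # binary response

# With every penalized coefficient at zero, the intercept-only
# logistic fit gives the same fitted probability for all observations.
mu <- rep(mean(y), n)

# Derivative of the log-likelihood with respect to each coefficient.
grad <- drop(crossprod(x, y - mu))

# Largest lambda at which some variable is about to enter.
lambda <- max(abs(grad))

# A strict comparison at exactly this lambda admits no variable ...
strict <- which(abs(grad) > lambda)

# ... while the relaxed rule admits the variable with the largest gradient.
relax.lambda <- 1e-8
candidates <- which(abs(grad) > lambda * (1 - relax.lambda))
```

The relaxation matters because \(\lambda\) at a knot is computed numerically, so a variable whose derivative equals \(\lambda\) up to rounding error could otherwise fail to enter.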

References

Mee Young Park and Trevor Hastie (2007) L1 regularization path algorithm for generalized linear models. J. R. Statist. Soc. B, 69, 659-677.

Examples


# NOT RUN {
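library(glmpath)
data(heart.data)   # list with x (feature matrix) and y (response)
attach(heart.data)
# L1 regularization path for logistic regression
fit.a <- glmpath(x, y, family=binomial)
# L1 regularization path with a gaussian family on the same data
fit.b <- glmpath(x, y, family=gaussian)
detach(heart.data)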
# }
