pense (version 1.0.8)

elnet: Elastic Net Estimator for Regression

Description

Estimate the elastic net regression coefficients.

Usage

elnet(x, y, alpha, nlambda = 100, lambda, weights, intercept = TRUE,
  options = en_options_aug_lars(), lambda_min_ratio, xtest,
  correction = TRUE)

Arguments

x

data matrix with predictors

y

response vector

alpha

controls the balance between the L1 and the L2 penalties: alpha = 0 gives the pure ridge (L2) penalty, alpha = 1 the pure lasso (L1) penalty.

nlambda

size of the lambda grid if lambda is not specified.

lambda

a grid of decreasing lambda values.

weights

an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, the weighted EN estimator is computed with the given weights. See also 'Details'.

intercept

should an intercept be estimated?

options

additional options for the EN algorithm. See en_options for details.

lambda_min_ratio

if the lambda grid should be automatically defined, the ratio of the smallest to the largest lambda value in the grid. The default is 1e-6 if n < p, and 1e-5 * 10^floor(log10(p / n)) otherwise. A sketch of supplying a custom grid instead follows this list.

xtest

data matrix with predictors used for prediction. This is useful for testing the prediction performance on an independent test set.

correction

should the "corrected" EN estimator be returned? If TRUE (the default), the corrected estimator as defined in Zou & Hastie (2005) is returned.
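
For illustration, a custom grid can be supplied directly instead of letting elnet generate one. The endpoints below are purely hypothetical, and x and y are assumed to be defined as in the Examples; when lambda is given, nlambda and lambda_min_ratio have no effect.

# Sketch: a hand-picked, decreasing lambda grid (illustrative values)
lambda_grid <- 10^seq(1, -4, length.out = 50)
est <- elnet(x, y, alpha = 0.6, lambda = lambda_grid)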

Value

lambda

vector of lambda values.

status

integer specifying the exit status of the EN algorithm.

message

explanation of the exit status.

coefficients

matrix of regression coefficients. Each column corresponds to the estimate for the lambda value at the same index.

residuals

matrix of residuals. Each column corresponds to the residuals for the lambda value at the same index.

predictions

if xtest was given, matrix of predicted values. Each column corresponds to the predictions for the lambda value at the same index.
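
The columns of these matrices line up with the lambda grid, so the fit for a single penalty level can be extracted by column index. A minimal sketch, assuming est holds the value returned by a call like those in the Examples:

# Coefficients and residuals for the 10th value of est$lambda
beta_10 <- est$coefficients[, 10]
resid_10 <- est$residuals[, 10]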

Algorithms

Currently this function can compute the elastic net estimator using either augmented LARS or the Dual Augmented Lagrangian (DAL) algorithm (Tomioka et al. 2011). Augmented LARS performs the LASSO via the LARS algorithm (or OLS if alpha = 0) on the data matrix augmented with the L2 penalty term. The time complexity of this algorithm grows quickly with the number of predictors. The algorithm currently cannot leverage a previous or an approximate solution to speed up computations; however, it is always guaranteed to find the solution.

DAL is an iterative algorithm that directly minimizes the Elastic Net objective. The algorithm can take an approximate solution as a starting point to speed up convergence. For very small lambda values and a bad starting point, DAL may not converge to the solution and hence give wrong results; this is indicated in the returned status code. The time complexity of this algorithm is dominated by the number of observations.

DAL is much faster for a small number of observations (< 200) and a large number of predictors, especially if an approximate solution is available.
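
Because a DAL convergence failure is reported only through the exit status, it is prudent to inspect the status after a fit. A minimal sketch, assuming (as the code comment notes) that a non-zero status signals a problem; message carries the authoritative explanation:

# Sketch: fit with DAL and check the exit status
est_dal <- elnet(x, y, alpha = 0.6, options = en_options_dal())
if (est_dal$status != 0) {  # assumption: non-zero status = failure
    warning("EN algorithm exited abnormally: ", est_dal$message)
}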

Details

This solves the minimization problem $$\frac{1}{2 N} \mathrm{RSS} + \lambda \left( \frac{1 - \alpha}{2} \| \beta \|_2^2 + \alpha \| \beta \|_1 \right)$$

If weights are supplied, the minimization problem becomes $$\frac{1}{2 N} \sum_{i = 1}^N w_i r_i^2 + \lambda \left( \frac{1 - \alpha}{2} \| \beta \|_2^2 + \alpha \| \beta \|_1 \right)$$
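
As a short sketch of the weighted variant, pass one weight per observation; the weights below are arbitrary and purely illustrative, with x and y as in the Examples.

# Down-weight the first five observations (illustrative values)
w <- rep(1, nrow(x))
w[1:5] <- 0.1
est_w <- elnet(x, y, alpha = 0.6, weights = w)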

References

Tomioka, R., Suzuki, T. and Sugiyama, M. (2011). Super-Linear Convergence of Dual Augmented Lagrangian Algorithm for Sparse Learning. Journal of Machine Learning Research 12(May):1537-1586.

Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society. Series B (Statistical Methodology), 67(2):301-320.

See Also

elnet_cv for automatic selection of the penalty parameter based on the cross-validated prediction error.

Examples

# Generate some dummy data
set.seed(12345)
n <- 30
p <- 15
x <- 1 + matrix(rnorm(n * p), ncol = p)
y <- x %*% c(2:5, numeric(p - 4)) + rnorm(n)

x_test <- matrix(rnorm(10 * n * p), ncol = p)
y_test <- drop(x_test %*% c(2:5, numeric(p - 4)) + rnorm(10 * n))

# Compute the classical EN with predictions for x_test
set.seed(1234)
est <- elnet(
    x, y,
    alpha = 0.6,
    nlambda = 100,
    xtest = x_test
)

# Plot the RMSPE computed from the given test set
rmspe_test <- sqrt(colMeans((y_test - est$predictions)^2))
plot(est$lambda, rmspe_test, log = "x")

##
## For large data sets, the DAL algorithm is much faster
##
set.seed(12345)
n <- 100
p <- 1500
x <- 1 + matrix(rnorm(n * p), ncol = p)
y <- x %*% c(2:5, numeric(p - 4)) + rnorm(n)

x_test <- matrix(rnorm(10 * n * p), ncol = p)
y_test <- drop(x_test %*% c(2:5, numeric(p - 4)) + rnorm(10 * n))

# The DAL algorithm takes ~1.5 seconds to compute the solution path
set.seed(1234)
system.time(
    est_dal <- elnet(
        x, y,
        alpha = 0.6,
        nlambda = 100,
        options = en_options_dal(),
        xtest = x_test
    )
)

# In comparison, the augmented LARS algorithm can take several minutes
set.seed(1234)
system.time(
    est_auglars <- elnet(
        x, y,
        alpha = 0.6,
        nlambda = 100,
        options = en_options_aug_lars(),
        xtest = x_test
    )
)