
hqreg (version 1.2)

hqreg: Fit a robust regression model with Huber or quantile loss penalized by lasso or elastic-net

Description

Fit solution paths for Huber loss regression or quantile regression penalized by lasso or elastic-net over a grid of values for the regularization parameter lambda.

Usage

hqreg(X, y, method = c("huber", "quantile", "ls"),
    gamma, tau = 0.5, alpha = 1, nlambda = 100, 
    lambda.min = ifelse(nrow(X)>ncol(X), 0.001, 0.05), lambda, 
    preprocess = c("standardize", "rescale", "none"), screen = c("ASR", "SR", "none"), 
    max.iter = 10000, eps = 1e-7, dfmax = ncol(X)+1, penalty.factor = rep(1, ncol(X)), 
    message = FALSE)

Arguments

X
Input matrix.
y
Response vector.
method
The loss function to be used in the model. Either "huber" (default), "quantile", or "ls" for least squares (see Details).
gamma
The tuning parameter of Huber loss, with no effect for the other loss functions. Huber loss is quadratic for absolute values less than gamma and linear for those greater. If missing, the function uses a default value computed from the data (the 5th percentile of the absolute residuals).
tau
The tuning parameter of the quantile loss, with no effect for the other loss functions. It represents the conditional quantile of the response to be estimated, so must be a number between 0 and 1. It includes the absolute loss when tau = 0.5 (default).
alpha
The elastic-net mixing parameter that controls the relative contribution from the lasso and the ridge penalty. It must be a number between 0 and 1. alpha=1 is the lasso penalty and alpha=0 the ridge penalty.
nlambda
The number of lambda values. Default is 100.
lambda.min
The smallest value for lambda, as a fraction of lambda.max, the data derived entry value. Default is 0.001 if the number of observations is larger than the number of variables and 0.05 otherwise.
lambda
A user-specified sequence of lambda values. Typical usage is to leave this blank and have the program automatically compute a lambda sequence based on nlambda and lambda.min. Specifying lambda overrides this.
preprocess
Preprocessing technique to be applied to the input. Either "standardize" (default), "rescale" or "none" (see Details). The coefficients are always returned on the original scale.
screen
Screening rule to be applied at each lambda that discards variables for speed. Either "ASR" (default), "SR" or "none". "SR" stands for the strong rule, and "ASR" for the adaptive strong rule. Using "ASR" typically requires fewer iterations to converge than "SR".
max.iter
Maximum number of iterations. Default is 10000.
eps
Convergence threshold. The algorithms continue until the maximum change in the objective after any coefficient update is less than eps times the null deviance. Default is 1E-7.
dfmax
Upper bound for the number of nonzero coefficients. The algorithm exits and returns a partial path if dfmax is reached. Useful for very large dimensions.
penalty.factor
A numeric vector of length equal to the number of variables. Each component multiplies lambda to allow differential penalization. Can be 0 for some variables, in which case the variable is always in the model without penalization.
message
If set to TRUE, hqreg will inform the user of its progress. This argument is kept for debugging. Default is FALSE.
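To make the lambda-related arguments concrete, here is a small base-R sketch of how a log-spaced lambda path can be built from nlambda and lambda.min. This mirrors the construction commonly used by penalized-regression packages; the exact rule hqreg applies internally may differ, and the lambda_max value below is a placeholder, not something computed by the package.

```r
# Illustrative lambda path (assumption: hqreg's internal rule may differ in
# detail; this shows the common log-spaced pattern, decreasing from
# lambda_max down to lambda_min * lambda_max).
make_lambda_path <- function(lambda_max, n_obs, n_vars,
                             nlambda = 100,
                             lambda_min = ifelse(n_obs > n_vars, 0.001, 0.05)) {
  exp(seq(log(lambda_max), log(lambda_min * lambda_max), length.out = nlambda))
}

path <- make_lambda_path(lambda_max = 1.5, n_obs = 1000, n_vars = 100)
length(path)  # 100 values
range(path)   # from 0.0015 up to 1.5
```

Because n_obs > n_vars here, lambda_min defaults to 0.001, matching the lambda.min default described above.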

Value

The function returns an object of S3 class "hqreg", which is a list containing:

  • call: The call that produced this object.
  • beta: The fitted matrix of coefficients. The number of rows is equal to the number of coefficients, and the number of columns is equal to nlambda. An intercept is included.
  • iter: A vector of length nlambda containing the number of iterations until convergence at each value of lambda.
  • saturated: A logical flag for whether the number of nonzero coefficients has reached dfmax.
  • lambda: The sequence of regularization parameter values in the path.
  • alpha: Same as above.
  • gamma: Same as above. NULL except when method = "huber".
  • tau: Same as above. NULL except when method = "quantile".
  • penalty.factor: Same as above.
  • method: Same as above.
  • nv: The variable screening rules are accompanied by checks of optimality conditions. When violations occur, the program adds the violating variables back in and re-runs the inner loop until convergence. nv is the number of violations.

Details

The sequence of models indexed by the regularization parameter lambda is fit using a semismooth Newton coordinate descent algorithm. The objective function is defined to be $$\frac{1}{n} \sum_i \textrm{loss}_i + \lambda\,\textrm{penalty}.$$ For method = "huber", $$\textrm{loss}(t) = \frac{t^2}{2\gamma} I(|t|\le \gamma) + \left(|t| - \frac{\gamma}{2}\right) I(|t| > \gamma);$$ for method = "quantile", $$\textrm{loss}(t) = t\,(\tau - I(t<0));$$ for method = "ls", $$\textrm{loss}(t) = \frac{t^2}{2}.$$ In the model, t is replaced by the residuals.
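The three loss functions above can be written directly in base R. This is an illustrative reimplementation of the formulas for clarity, not code taken from the package:

```r
# Huber loss: quadratic for |t| <= gamma, linear beyond
huber_loss <- function(t, gamma) {
  ifelse(abs(t) <= gamma, t^2 / (2 * gamma), abs(t) - gamma / 2)
}

# Quantile (check) loss: tilted absolute loss, slope tau for t > 0
# and tau - 1 for t < 0
quantile_loss <- function(t, tau) {
  t * (tau - (t < 0))
}

# Least-squares loss
ls_loss <- function(t) t^2 / 2

huber_loss(c(1, 3), gamma = 2)     # 0.25 (quadratic region), 2 (linear region)
quantile_loss(c(-2, 2), tau = 0.5) # 1 1 -- absolute loss halved at tau = 0.5
ls_loss(3)                         # 4.5
```

At tau = 0.5 the quantile loss penalizes positive and negative residuals symmetrically, which is why the documentation notes it includes the absolute loss.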

The program supports different types of preprocessing techniques. They are applied to each column of the input matrix X. Let x be a column of X. For preprocess = "standardize", the formula is $$x' = \frac{x-\textrm{mean}(x)}{\textrm{sd}(x)};$$ for preprocess = "rescale", $$x' = \frac{x-\min(x)}{\max(x)-\min(x)}.$$ The models are fit on the preprocessed input, and the coefficients are then transformed back to the original scale.
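The two preprocessing formulas can be sketched in base R as follows (illustrative only; note that R's sd() uses the n - 1 denominator, and the package's internal scaling may differ in that detail):

```r
# Column-wise preprocessing as described above
standardize <- function(x) (x - mean(x)) / sd(x)
rescale     <- function(x) (x - min(x)) / (max(x) - min(x))

x <- c(2, 4, 6, 8)
round(mean(standardize(x)), 10)  # 0: standardized column has mean zero
sd(standardize(x))               # 1: and unit standard deviation
range(rescale(x))                # 0 1: rescaled column spans [0, 1]
```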

See Also

plot.hqreg, cv.hqreg

Examples

set.seed(123)  # for reproducibility
X <- matrix(rnorm(1000 * 100), 1000, 100)
beta <- rnorm(10)
eps <- 4 * rnorm(1000)
y <- drop(X[, 1:10] %*% beta + eps)

# Huber loss
fit1 <- hqreg(X, y)
coef(fit1, 0.01)
predict(fit1, X[1:5, ], lambda = c(0.02, 0.01))

# Quantile loss
fit2 <- hqreg(X, y, method = "quantile", tau = 0.2)
plot(fit2, xvar = "norm")

# Squared loss
fit3 <- hqreg(X, y, method = "ls", preprocess = "rescale")
plot(fit3, xvar = "lambda", log.x = TRUE)
