ncpen.reg: nonconvex penalized estimation

Description

Fits generalized linear models by penalized maximum likelihood estimation. The coefficient path is computed over a grid of values of the regularization parameter lambda. Supported models are Gaussian (linear), binomial logit (logistic), Poisson, and multinomial logit regression, and the Cox proportional hazards model, with various nonconvex penalties.

Usage

ncpen.reg(formula, data, family = c("gaussian", "linear", "binomial",
  "logit", "multinomial", "cox", "poisson"), penalty = c("scad", "mcp",
  "tlp", "lasso", "classo", "ridge", "sridge", "mbridge", "mlog"),
  x.standardize = TRUE, intercept = TRUE, lambda = NULL,
  n.lambda = NULL, r.lambda = NULL, w.lambda = NULL, gamma = NULL,
  tau = NULL, alpha = NULL, df.max = 50, cf.max = 100,
  proj.min = 10, add.max = 10, niter.max = 30, qiter.max = 10,
  aiter.max = 100, b.eps = 1e-07, k.eps = 1e-04, c.eps = 1e-06,
  cut = TRUE, local = FALSE, local.initial = NULL)

Arguments

formula

(formula) regression formula. To include or exclude the intercept, use the intercept argument rather than the "0 +" option in the formula. The response y must be coded 0, 1 for binomial and 1, 2, ... for multinomial.
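The response coding described above can be prepared in base R before calling ncpen.reg. The following is a minimal sketch; the variable names and the label values ("no", "yes", "a", "b", "c") are hypothetical and serve only as an illustration.

```r
# Binomial: recode a two-level response to 0/1 (hypothetical labels).
y.raw <- c("no", "yes", "yes", "no")
y.bin <- as.integer(factor(y.raw, levels = c("no", "yes"))) - 1L

# Multinomial: recode class labels to 1, 2, ..., K (hypothetical labels).
cls <- c("b", "a", "c", "a")
y.mul <- as.integer(factor(cls))
```

Here factor() orders the levels alphabetically unless levels is given, so set levels explicitly when a particular class should map to a particular code.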

data

(numeric matrix or data.frame) contains both y and X. Each row is an observation vector. For cox, the censoring indicator must be the last column of the data.

family

(character) regression model. Supported models are gaussian (or linear), binomial (or logit), poisson, multinomial, and cox. Default is gaussian.

penalty

(character) penalty function. Supported penalties are scad (smoothly clipped absolute deviation), mcp (minimax concave penalty), tlp (truncated LASSO penalty), lasso (least absolute shrinkage and selection operator), classo (clipped lasso = mcp + lasso), ridge (ridge), sridge (sparse ridge = mcp + ridge), mbridge (modified bridge) and mlog (modified log). Default is scad.

x.standardize

(logical) whether to standardize x.mat prior to fitting the model (see details). The estimated coefficients are always restored to the original scale.
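As an illustration of what standardizing and restoring the scale means, here is a base R sketch. It assumes the columns are scaled to unit Euclidean length and uses ordinary least squares without an intercept as a stand-in for the penalized fit; the scaling constant and any centering used inside the package may differ.

```r
set.seed(1)
# Columns on very different scales.
x.mat <- matrix(rnorm(40), nrow = 10, ncol = 4) %*% diag(c(1, 5, 10, 50))
y.vec <- drop(x.mat %*% c(1, -1, 0.5, 0) + rnorm(10))

# Scale each column to unit Euclidean length (assumed target length).
std <- sqrt(colSums(x.mat^2))
x.std <- sweep(x.mat, 2, std, "/")

# Fit on the scaled columns, then divide by the scale factors
# to express the coefficients on the original scale.
b.std <- coef(lm(y.vec ~ x.std - 1))
b.orig <- b.std / std
```

For unpenalized least squares, b.orig matches the coefficients obtained by fitting on x.mat directly; with a penalty, the scaling changes which coefficients are shrunk, which is why it is applied before fitting.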

intercept

(logical) whether to include an intercept in the model.

lambda

(numeric vector) user-specified sequence of lambda values. By default, the sequence is computed automatically from the data.

n.lambda

(numeric) the number of lambda values. Default is 100.

r.lambda

(numeric) ratio of the smallest lambda value to the largest. Default is 0.001 when n>p, and 0.01 otherwise.
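To show how n.lambda and r.lambda could determine a grid, the sketch below builds a decreasing, log-spaced sequence in base R. The log spacing and the formula used for the largest lambda are assumptions for illustration; the page does not state how the package constructs its default sequence.

```r
set.seed(1)
n <- 50; p <- 10
x.mat <- matrix(rnorm(n * p), n, p)
y.vec <- rnorm(n)

n.lambda <- 100
r.lambda <- if (n > p) 0.001 else 0.01

# Assumed largest lambda: the lasso-type value at which all coefficients are zero.
lambda.max <- max(abs(crossprod(x.mat, y.vec - mean(y.vec)))) / n

# Assumed log-spaced grid from lambda.max down to lambda.max * r.lambda.
lambda <- exp(seq(log(lambda.max), log(lambda.max * r.lambda),
                  length.out = n.lambda))
```

A vector built this way could be passed through the lambda argument when a custom grid is wanted.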

w.lambda

(numeric vector) penalty weights for each coefficient (see references). If a penalty weight is set to 0, the corresponding coefficient is always nonzero.

gamma

(numeric) additional tuning parameter for controlling shrinkage effect of classo and sridge (see references). Default is half of the smallest lambda.

tau

(numeric) concavity parameter of the penalties (see references). Default is 3.7 for scad, 2.1 for mcp, classo and sridge, 0.001 for tlp, mbridge and mlog.
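To make the role of the concavity parameter concrete, the sketch below writes the SCAD and MCP penalties in their textbook forms (Fan and Li, 2001; Zhang, 2010), with tau playing the role of the concavity parameter. The function names scad.pen and mcp.pen are hypothetical, and the package's internal parameterization may differ.

```r
# SCAD penalty (textbook form), tau > 2.
scad.pen <- function(b, lambda, tau = 3.7) {
  a <- abs(b)
  ifelse(a <= lambda,
         lambda * a,
         ifelse(a <= tau * lambda,
                (2 * tau * lambda * a - a^2 - lambda^2) / (2 * (tau - 1)),
                lambda^2 * (tau + 1) / 2))
}

# MCP penalty (textbook form), tau > 1.
mcp.pen <- function(b, lambda, tau = 2.1) {
  a <- abs(b)
  ifelse(a <= tau * lambda,
         lambda * a - a^2 / (2 * tau),
         tau * lambda^2 / 2)
}

b <- c(0, 0.5, 1, 2, 5)
scad.pen(b, lambda = 1)
mcp.pen(b, lambda = 1)
```

Both penalties are constant once the absolute coefficient exceeds tau * lambda, so large coefficients receive no additional shrinkage; a smaller tau makes the penalty flatten sooner.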

alpha

(numeric) ridge effect (weight between the penalty and the ridge penalty) (see details). Default value is 1. If penalty is ridge or sridge, then alpha is set to 0.

df.max

(numeric) the maximum number of nonzero coefficients.

cf.max

(numeric) the maximum absolute value of nonzero coefficients.

proj.min

(numeric) the projection cycle inside the CD algorithm (largely for internal use; see details).

add.max

(numeric) the maximum number of variables added in CCCP iterations (largely for internal use; see references).

niter.max

(numeric) maximum number of iterations in CCCP.

qiter.max

(numeric) maximum number of quadratic approximations in each CCCP iteration.

aiter.max

(numeric) maximum number of iterations in CD algorithm.

b.eps

(numeric) convergence threshold for coefficients vector.

k.eps

(numeric) convergence threshold for KKT conditions.

c.eps

(numeric) convergence threshold for KKT conditions (largely internal use).

cut

(logical) internal control flag (largely internal use).

local

(logical) whether to use local initial estimator for path construction. It may take a long time.

local.initial

(numeric vector) initial estimator for local=TRUE.

Value

An object with S3 class ncpen.

y.vec

response vector.

x.mat

design matrix.

family

regression model.

penalty

penalty.

x.standardize

whether x.mat was standardized.

intercept

whether to include the intercept.

std

scale factor for x.standardize.

lambda

sequence of lambda values.

w.lambda

penalty weights.

gamma

extra shrinkage parameter for classo and sridge only.

alpha

ridge effect.

local

whether to use local initial estimator.

local.initial

local initial estimator for local=TRUE.

beta

fitted coefficients. Use coef.ncpen for multinomial since the coefficients are represented as vectors.

df

the number of non-zero coefficients.

Details

The sequence of models indexed by lambda is fit using the concave-convex procedure (CCCP) and the coordinate descent (CD) algorithm (see references). The objective function is $$ \frac{\text{sum of squared residuals}}{2n} + \left[\alpha \cdot \text{penalty} + (1-\alpha) \cdot \text{ridge}\right] $$ for gaussian and $$ \frac{\text{log-likelihood}}{n} - \left[\alpha \cdot \text{penalty} + (1-\alpha) \cdot \text{ridge}\right] $$ for the others, assuming the canonical link.

The algorithm applies the warm start strategy (see references) and tries projections after proj.min iterations of the CD algorithm, which makes the algorithm fast and stable.

x.standardize scales each column of x.mat to the same Euclidean length, but the coefficients are re-scaled to the original scale. In the multinomial case, the coefficients are expressed in vector form; use coef.ncpen to extract them.
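The gaussian objective described above can be evaluated directly for a given coefficient vector. The sketch below is illustrative only: obj.gaussian is a hypothetical helper, the default penalty is the lasso, and the ridge term is assumed to be lambda * sum(beta^2) / 2, which the page does not specify.

```r
# Hypothetical helper: gaussian objective for a fixed coefficient vector.
obj.gaussian <- function(y.vec, x.mat, beta, lambda, alpha = 1,
                         penalty = function(b, lambda) lambda * sum(abs(b)),
                         ridge = function(b, lambda) lambda * sum(b^2) / 2) {
  n <- length(y.vec)
  rss <- sum((y.vec - drop(x.mat %*% beta))^2)
  # (sum of squared residuals)/2n + [alpha*penalty + (1-alpha)*ridge]
  rss / (2 * n) + alpha * penalty(beta, lambda) + (1 - alpha) * ridge(beta, lambda)
}

set.seed(1)
x.mat <- matrix(rnorm(60), 20, 3)
y.vec <- drop(x.mat %*% c(1, 0, -1) + rnorm(20))

# With all coefficients at zero, only the loss term remains.
obj.zero <- obj.gaussian(y.vec, x.mat, beta = rep(0, 3), lambda = 0.1)
# Mixing the penalty and the assumed ridge term with alpha = 0.5.
obj.mix <- obj.gaussian(y.vec, x.mat, beta = c(1, 0, -1), lambda = 0.1, alpha = 0.5)
```

Passing a different function as penalty, for example a SCAD or MCP form, shows how the same loss is combined with a nonconvex penalty.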

References

Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348-1360.

Zhang, C.H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894-942.

Shen, X., Pan, W., Zhu, Y. and Zhou, H. (2013). On constrained and regularized high-dimensional regression. Annals of the Institute of Statistical Mathematics, 65(5), 807-832.

Kwon, S., Lee, S. and Kim, Y. (2016). Moderately clipped LASSO. Computational Statistics and Data Analysis, 92C, 53-67.

Kwon, S., Kim, Y. and Choi, H. (2013). Sparse bridge estimation with a diverging number of parameters. Statistics and Its Interface, 6, 231-242.

Huang, J., Horowitz, J.L. and Ma, S. (2008). Asymptotic properties of bridge estimators in sparse high-dimensional regression models. The Annals of Statistics, 36(2), 587-613.

Zou, H. and Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models. The Annals of Statistics, 36(4), 1509.

Lee, S., Kwon, S. and Kim, Y. (2016). A modified local quadratic approximation algorithm for penalized optimization problems. Computational Statistics and Data Analysis, 94, 275-286.

Examples


# NOT RUN {
### linear regression with scad penalty
# simulate a gaussian sample with sam.gen.ncpen
sam = sam.gen.ncpen(n=200, p=5, q=5, cf.min=0.5, cf.max=1, corr=0.5, family="gaussian")
x.mat = sam$x.mat; y.vec = sam$y.vec
# combine response and predictors, with the response in the first column
data = cbind(y.vec, x.mat)
colnames(data) = c("y", paste("xv", 1:ncol(x.mat), sep = ""))
# formula interface
fit1 = ncpen.reg(formula = y ~ xv1 + xv2 + xv3 + xv4 + xv5, data = data,
                 family="gaussian", penalty="scad")
# matrix interface on the same data
fit2 = ncpen(y.vec=y.vec, x.mat=x.mat)

# }
