glmreg: fit a GLM with lasso (or elastic net), snet or mnet regularization

Description

Fit a generalized linear model via penalized maximum likelihood. The regularization path is computed for the lasso (or elastic net), SCAD (or snet) and MCP (or mnet) penalties at a grid of values for the regularization parameter lambda. Fits linear, logistic, Poisson and negative binomial (fixed scale parameter) regression models.
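To make the three penalty families concrete, here is a minimal base-R sketch of the lasso, MCP and SCAD penalties for a single coefficient, following the standard forms in Breheny and Huang (2011). It is an illustration only, not the code glmreg uses internally: the function names `lasso_pen`, `mcp_pen` and `scad_pen` and the default `gamma` values are assumptions made for this sketch, and the additional ridge-type term used by the elastic net, mnet and snet variants is left out.

```r
# Illustrative penalty functions (hypothetical names, not mpath internals).
# The gamma defaults are assumptions chosen for this sketch.
lasso_pen <- function(beta, lambda) lambda * abs(beta)

mcp_pen <- function(beta, lambda, gamma = 3) {
  b <- abs(beta)
  # quadratic taper up to gamma * lambda, constant afterwards
  ifelse(b <= gamma * lambda,
         lambda * b - b^2 / (2 * gamma),
         gamma * lambda^2 / 2)
}

scad_pen <- function(beta, lambda, gamma = 3.7) {
  b <- abs(beta)
  # lasso near zero, quadratic transition, then constant
  ifelse(b <= lambda,
         lambda * b,
         ifelse(b <= gamma * lambda,
                (2 * gamma * lambda * b - b^2 - lambda^2) / (2 * (gamma - 1)),
                lambda^2 * (gamma + 1) / 2))
}

# Compare the penalties on a small grid of coefficient values
beta_grid <- c(0, 0.5, 1, 2, 5)
pen_table <- data.frame(beta  = beta_grid,
                        lasso = lasso_pen(beta_grid, lambda = 1),
                        mcp   = mcp_pen(beta_grid, lambda = 1),
                        scad  = scad_pen(beta_grid, lambda = 1))
print(pen_table)
```

The lasso penalty grows without bound, while MCP and SCAD level off for large coefficients, which is why the nonconvex penalties shrink large effects less.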

Usage

## S3 method for class 'formula':
glmreg(formula, data, weights, offset=NULL, contrasts=NULL, 
x.keep=FALSE, y.keep=TRUE, ...)
## S3 method for class 'matrix':
glmreg(x, y, weights, offset=NULL, ...)
## S3 method for class 'default':
glmreg(x,  ...)

Arguments

formula

symbolic description of the model, see details.

data

argument controlling formula processing via model.frame.

weights

optional numeric vector of weights. If standardize=TRUE, weights are renormalized to weights/sum(weights). If standardize=FALSE, weights are kept as originally supplied.

x

input matrix, of dimension nobs x nvars; each row is an observation vector.

y

response variable. Quantitative for family="gaussian". Non-negative counts for family="poisson" or family="negbin". For family="binomial" should be either a factor with two levels or a vector of pr

x.keep, y.keep

logical values: should the model matrix x (x.keep) and the response variable y (y.keep) be kept in the returned object?

offset

Not implemented yet

contrasts

the contrasts corresponding to levels from the respective models

...

Other arguments passed to glmreg_fit.

Value

An object with S3 class "glmreg" for the various types of models.

call

the call that produced this object

b0

Intercept sequence of length length(lambda)

beta

An nvars x length(lambda) matrix of coefficients.

lambda

The actual sequence of lambda values used

dev

The computed deviance (for "gaussian", this is the R-square). The deviance calculations incorporate weights if present in the model. The deviance is defined to be 2*(loglike_sat - loglike), where loglike_sat is the log-likelihood for the saturated model (a model with a free parameter per observation).

nulldev

Null deviance (per observation). This is defined to be 2*(loglike_sat - loglike(Null)). The null model refers to the intercept model.

nobs

number of observations

pll

penalized log-likelihood values for standardized coefficients in the IRLS iterations.

pllres

penalized log-likelihood value for the estimated model on the original scale of coefficients
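The deviance definition 2*(loglike_sat - loglike) can be checked by hand. The sketch below uses an unpenalized Poisson fit from base R's `glm`, not glmreg, so it only illustrates the formula. Any per-observation or weight scaling that glmreg applies to `dev` and `nulldev` is not reproduced here, and the simulated data are made up for the example.

```r
# Simulated Poisson data (arbitrary values for illustration)
set.seed(1)
x <- rnorm(50)
y <- rpois(50, lambda = exp(0.3 + 0.5 * x))

# Unpenalized Poisson GLM from base R
fit <- glm(y ~ x, family = poisson)
mu  <- fitted(fit)

# Log-likelihood of the fitted model and of the saturated model (mu_i = y_i)
loglik     <- sum(dpois(y, mu, log = TRUE))
loglik_sat <- sum(dpois(y, y, log = TRUE))
dev_manual <- 2 * (loglik_sat - loglik)

# Null model: intercept only, so every fitted mean equals mean(y)
loglik_null    <- sum(dpois(y, mean(y), log = TRUE))
nulldev_manual <- 2 * (loglik_sat - loglik_null)

# Both should agree with what glm reports
all.equal(dev_manual, deviance(fit))
all.equal(nulldev_manual, fit$null.deviance)
```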

Details

The sequence of models implied by lambda is fit by coordinate descent. For family="gaussian" this is the lasso, mcp or scad sequence if alpha=1, else it is the enet, mnet or snet sequence. For the other families, this is a lasso (mcp, scad) or elastic net (mnet, snet) regularization path for fitting the generalized linear regression model, obtained by maximizing the appropriate penalized log-likelihood.

Note that the objective function for "gaussian" is $$\frac{1}{2} \cdot weights \cdot RSS + \lambda \cdot penalty$$ if standardize=FALSE and $$\frac{1}{2} \cdot \frac{weights}{\sum(weights)} \cdot RSS + \lambda \cdot penalty$$ if standardize=TRUE.

For the other models it is $$-\sum(weights \cdot loglik) + \lambda \cdot penalty$$ if standardize=FALSE and $$-\sum\left(\frac{weights}{\sum(weights)} \cdot loglik\right) + \lambda \cdot penalty$$ if standardize=TRUE.
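As a rough illustration of how the weight normalization changes the Gaussian objective, the base-R sketch below evaluates it for a fixed coefficient vector. Several things here are assumptions rather than glmreg's actual code: the helper name `gaussian_objective` is hypothetical, "weights*RSS" is read as a weighted residual sum of squares, the penalty is taken to be the lasso (alpha=1), and only the effect of standardize on the weights is shown.

```r
# Hypothetical helper: Gaussian objective with a lasso penalty.
# Only the weight renormalization tied to `standardize` is modelled here.
gaussian_objective <- function(x, y, b0, beta, lambda,
                               weights = rep(1, length(y)),
                               standardize = TRUE) {
  if (standardize) weights <- weights / sum(weights)
  resid <- y - b0 - drop(x %*% beta)
  # 1/2 * weighted RSS + lambda * L1 penalty
  0.5 * sum(weights * resid^2) + lambda * sum(abs(beta))
}

# Simulated data (arbitrary values for illustration)
set.seed(42)
n    <- 20
x    <- matrix(rnorm(n * 3), n, 3)
beta <- c(1, 0, -0.5)
y    <- drop(x %*% beta) + rnorm(n)

obj_raw <- gaussian_objective(x, y, b0 = 0, beta = beta, lambda = 0.1,
                              standardize = FALSE)
obj_std <- gaussian_objective(x, y, b0 = 0, beta = beta, lambda = 0.1,
                              standardize = TRUE)
c(raw = obj_raw, standardized = obj_std)
```

With unit weights, standardize=TRUE divides the loss term by the number of observations while leaving the penalty term unchanged, so a given lambda penalizes more heavily relative to the loss than it does with standardize=FALSE.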

References

Breheny, P. and Huang, J. (2011) Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann. Appl. Statist., 5: 232-253.

Wang, Z., Ma, S., Zappitelli, M., Parikh, C., Wang, C.-Y. and Devarajan, P. (2014) Penalized count data regression with application to hospital stay after pediatric cardiac surgery. Statistical Methods in Medical Research. 2014 Apr 17 [Epub ahead of print].

Examples

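library("mpath")
# binomial: simulated predictors with a random 0/1 response
set.seed(1)  # added so the simulated data are reproducible
x <- matrix(rnorm(100*20), 100, 20)
g2 <- sample(0:1, 100, replace=TRUE)
fit2 <- glmreg(x, g2, family="binomial")
# Poisson and negative binomial (the data come from the pscl package)
data("bioChemists", package = "pscl")
fm_pois <- glmreg(art ~ ., data = bioChemists, family = "poisson")
coef(fm_pois)
# negative binomial with the scale parameter fixed at theta=1
fm_nb1 <- glmreg(art ~ ., data = bioChemists, family = "negbin", theta=1)
coef(fm_nb1)
# negative binomial via glmregNB, called without a user-supplied theta
fm_nb2 <- glmregNB(art ~ ., data = bioChemists)
coef(fm_nb2)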
