gcrrp: Group penalized variable selection in competing risks regression

Description

Extends the R package grpreg to the proportional subdistribution hazards (PSH) model (Fine and Gray, 1999) and performs penalized variable selection at the group level. Available penalties are group LASSO, adaptive group LASSO, group SCAD, and group MCP.

Usage

gcrrp(time, fstatus, X, failcode = 1, cencode = 0, group=1:ncol(X), penalty=c("gLASSO", "gMCP", "gSCAD"),gamma=switch(penalty, SCAD=3.7, 2.7), alpha=1, lambda.min=0.001, nlambda=50, lambda, eps=.001,  max.iter=1000, weighted=FALSE)

Arguments

time

vector of failure/censoring times

fstatus

vector with a unique code for each failure type and a separate code for censored observations

X

design matrix; crrp standardizes and orthogonalizes X by default

failcode

code of fstatus that denotes the failure type of interest

cencode

code of fstatus that denotes censored observations

group

vector of group indicators, one per column of X (see Details)

penalty

penalty to be applied to the model. Either "gLASSO", "gSCAD", or "gMCP"

gamma

tuning parameter of the gMCP/gSCAD penalty. Default is 2.7 for group MCP and 3.7 for group SCAD.

alpha

tuning parameter indicating contributions from the MCP/SCAD penalty and the L2 penalty.

lambda.min

the smallest value for lambda. Default is .001

nlambda

number of lambda values. Default is 50

lambda

a user-specified sequence of lambda values. If not specified, a sequence of values of length nlambda is provided

eps

iteration stops when the relative change in any coefficient is less than eps. Default is 0.001

max.iter

maximum number of iterations. Default is 1000

weighted

Default is FALSE. If TRUE, it must be used with penalty="gLASSO" to produce the adaptive group LASSO penalty (see Details)

Value

$beta: fitted coefficient matrix with nvars rows and nlambda columns
$iter: number of iterations until convergence for each lambda
$group: same as above
$lambda: sequence of tuning parameter values
$penalty: same as above.
$gamma: same as above.
$alpha: same as above.
$loglik: log likelihood of the fitted model at each value of lambda
$GCV: generalized cross validation of the fitted model at each value of lambda
$BIC: Bayesian information criterion of the fitted model at each value of lambda
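
As a rough sketch of how these components fit together, the coefficients at the lambda value that minimizes BIC can be pulled out of the returned list. The `fit` object below is a hand-built stand-in with invented numbers, not real gcrrp output; only the component names `beta`, `lambda`, and `BIC` are taken from the list above.

```r
# Hypothetical stand-in for a gcrrp fit: 3 variables, 4 lambda values.
# All numbers are invented for illustration only.
fit <- list(
  beta   = matrix(c(0,   0,   0,      # column 1: largest lambda, empty model
                    0.4, 0,   0,
                    0.6, 0.1, 0,
                    0.7, 0.2, 0.05),  # column 4: smallest lambda
                  nrow = 3, ncol = 4),
  lambda = c(0.20, 0.10, 0.05, 0.01),
  BIC    = c(510, 498, 495, 501)
)

best        <- which.min(fit$BIC)     # index of the lambda with the smallest BIC
best_lambda <- fit$lambda[best]       # corresponding tuning parameter
best_beta   <- fit$beta[, best]       # coefficient vector at that lambda
selected    <- which(best_beta != 0)  # indices of variables kept in the model
```

The same pattern works with `fit$GCV` in place of `fit$BIC` if generalized cross validation is the preferred criterion.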

Details

The group vector indicates which group each column of X belongs to: columns that share a value are penalized together. For greatest efficiency, group should be a vector of consecutive integers, although unordered groups are also allowed.
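
If the grouping starts out as arbitrary labels, one way to obtain an integer group vector is sketched below in base R. The labels in `raw_group` are hypothetical, and treating "consecutive integers" as "integers 1, 2, 3, ... with members of each group adjacent" is an assumption about what the sentence above intends.

```r
# Arbitrary, unordered group labels (hypothetical example).
raw_group <- c("lab", "demo", "lab", "demo", "geno", "geno")

# Map each label to 1, 2, 3, ... in order of first appearance.
group_id <- match(raw_group, unique(raw_group))

# Ordering that places members of the same group next to each other.
ord          <- order(group_id)
group_sorted <- group_id[ord]

# The same ordering would be applied to the columns of the design matrix:
# X_sorted <- X[, ord]
```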

Penalties include group LASSO, group SCAD, and group MCP. Adaptive group LASSO is also available by setting weighted=TRUE together with penalty="gLASSO". In that case the gcrrp function calculates data-adaptive weights from the maximum partial likelihood estimator (MPLE) of the PSH model: the weight for each group is the inverse of the norm of the corresponding sub-vector of the MPLE. The model is fitted with the group coordinate descent algorithm.
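
The weight calculation described above can be illustrated with a short base R sketch. The coefficient values are made up, and the use of the Euclidean norm is an assumption, since the text only says "norm"; this is not the package's internal code.

```r
# Hypothetical unpenalized (MPLE) coefficients; values are invented.
beta_mple <- c(0.8, -0.6, 0.0, 0.3, 0.4)
group     <- c(1, 1, 2, 2, 2)

# Euclidean norm of each group's sub-vector of coefficients (assumed norm).
group_norm <- tapply(beta_mple, group, function(b) sqrt(sum(b^2)))

# Adaptive weight for each group: inverse of its norm.
w <- 1 / group_norm
```

Groups whose unpenalized coefficients are small receive large weights and are therefore penalized more heavily, which is the usual motivation for adaptive weighting.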

References

Breheny, P. and Huang, J. (2012). Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors. Statistics and Computing.
Fine, J. and Gray, R. (1999). A proportional hazards model for the subdistribution of a competing risk. JASA, 94: 496-509.
Fu, Z., Parikh, C. and Zhou, B. (2015). Penalized variable selection in competing risks regression. Manuscript submitted for publication.
Huang, J., Breheny, P. and Ma, S. (2012). A selective review of group selection in high dimensional models. Statistical Science, 27: 481-499.

Examples

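  library(crrp)
  set.seed(10)
  # simulate 200 subjects: exponential times; status 0 = censored, 1 and 2 = failure types
  ftime <- rexp(200)
  fstatus <- sample(0:2, 200, replace=TRUE)
  # 10 uniform covariates named x1, ..., x10
  cov <- matrix(runif(2000), nrow=200)
  dimnames(cov)[[2]] <- paste0("x", 1:ncol(cov))
  # assign the 10 covariates to 5 groups
  group <- c(1,1,2,2,2,3,4,4,5,5)
  # fit the group SCAD penalty and keep the coefficients at the lambda with the smallest BIC
  fit1 <- gcrrp(ftime, fstatus, cov, group=group, penalty="gSCAD")
  beta1 <- fit1$beta[, which.min(fit1$BIC)]
  # fit adaptive group LASSO (gLASSO with weighted=TRUE), again selecting by BIC
  fit2 <- gcrrp(ftime, fstatus, cov, group=group, penalty="gLASSO", weighted=TRUE)
  beta2 <- fit2$beta[, which.min(fit2$BIC)]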
