crrp: Penalized variable selection at the individual level in competing risks regression

Description

Extends R package ncvreg to the proportional subdistribution hazards model. Penalties include LASSO, SCAD, and MCP. User-specified weights can be assigned to the penalty for each coefficient.

Usage

crrp(time, fstatus, X, failcode = 1, cencode = 0, 
penalty = c("MCP", "SCAD", "LASSO"), gamma = switch(penalty, SCAD = 3.7, 2.7), 
alpha = 1, lambda.min = 0.001, nlambda = 50, lambda, eps = 0.001, 
max.iter = 1000, penalty.factor = rep(1, ncol(X)), weighted = FALSE)

Arguments

time

vector of failure/censoring times

fstatus

vector with a unique code for each failure type and a separate code for censored observations

X

design matrix; crrp standardizes X by default

failcode

code of fstatus that denotes the failure type of interest

cencode

code of fstatus that denotes censored observations

penalty

penalty to be applied to the model. Either "LASSO", "SCAD", or "MCP"

gamma

tuning parameter of the MCP/SCAD penalty. Default is 2.7 for MCP and 3.7 for SCAD

alpha

tuning parameter controlling the relative contributions of the MCP/SCAD penalty and the L2 (ridge) penalty. alpha = 1 gives the pure MCP/SCAD penalty, whereas alpha = 0 would be equivalent to ridge regression. Default is 1

lambda.min

the smallest value for lambda. Default is 0.001

nlambda

number of lambda values. Default is 50

lambda

a user-specified sequence of lambda values. If not specified, a sequence of nlambda values is generated automatically

eps

iteration stops when the relative change in any coefficient is less than eps. Default is 0.001

max.iter

maximum number of iterations. Default is 1000

penalty.factor

a vector of weights applied to the penalty for each coefficient. The length of the vector must be equal to the number of columns of X

weighted

if TRUE, the user must supply penalty weights (see penalty.factor). Default is FALSE
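
The lambda argument accepts any user-built sequence. The sketch below shows one way such a grid could be constructed in base R. The log spacing and the upper bound lam_max are assumptions made for illustration; this page does not state how crrp builds its default sequence.

```r
# Hypothetical upper end of the grid; in practice this would be a value
# large enough that every penalized coefficient is shrunk to zero.
lam_max <- 0.5
lam_min <- 0.001   # same value as the lambda.min default
n_lam   <- 50      # same value as the nlambda default

# Log-spaced, decreasing grid (an assumed spacing, not crrp's documented rule)
lambda_grid <- exp(seq(log(lam_max), log(lam_min), length.out = n_lam))

# The grid could then be passed through the lambda argument, e.g.
# fit <- crrp(ftime, fstatus, cov, penalty = "MCP", lambda = lambda_grid)
```

A decreasing grid starts from the sparsest model and moves toward less penalized fits, which is the usual ordering for path-wise coordinate descent.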

Value

$beta: matrix of fitted coefficients with nvars rows and nlambda columns
$iter: number of iterations until convergence for each lambda
$lambda: sequence of tuning parameter values
$penalty: same as in Arguments
$gamma: same as in Arguments
$alpha: same as in Arguments
$loglik: log likelihood of the fitted model at each value of lambda
$GCV: generalized cross validation of the fitted model at each value of lambda
$BIC: Bayesian information criterion of the fitted model at each value of lambda
$SE: matrix of standard errors with nvars rows and nlambda columns
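
Because the columns of $beta and $SE line up with the entries of $lambda, $BIC, and $GCV, a model is picked by indexing one column. The sketch below uses a toy list in place of a real fit; toy_fit and all of its numbers are invented for illustration and are not crrp output.

```r
# Toy stand-in for a crrp fit: 3 variables, 4 lambda values.
# Every number here is made up purely to show the indexing.
toy_fit <- list(
  lambda = c(0.4, 0.2, 0.1, 0.05),
  beta = matrix(c(0,   0,    0,
                  0.3, 0,    0,
                  0.5, 0,   -0.2,
                  0.6, 0.1, -0.3),
                nrow = 3,                      # filled column by column
                dimnames = list(c("x1", "x2", "x3"), NULL)),
  BIC = c(410, 402, 398, 401),
  GCV = c(2.10, 2.04, 2.01, 2.00)
)

best_bic <- which.min(toy_fit$BIC)   # column index chosen by BIC
best_gcv <- which.min(toy_fit$GCV)   # column index chosen by GCV

beta_bic <- toy_fit$beta[, best_bic]          # coefficients at the BIC-optimal lambda
selected <- names(beta_bic)[beta_bic != 0]    # variables retained at that lambda
```

The two criteria need not agree on the same column, so it can be worth inspecting both before settling on a model.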

Details

The crrp function penalizes the partial likelihood of the proportional subdistribution hazards model of Fine and Gray (1999) with a LASSO, SCAD, or MCP penalty. The model is fitted with a coordinate descent algorithm. BIC and GCV are computed at each value of lambda so that the optimal tuning parameter can be selected.
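
For reference, the three penalties can be written as functions of a single coefficient. The sketch below uses the standard definitions from the literature (see Breheny and Huang, 2011) with alpha = 1. It is not code taken from crrp, and the function names are hypothetical.

```r
# Penalty value for a coefficient b at tuning parameters lambda and gamma.
lasso_pen <- function(b, lambda) lambda * abs(b)

mcp_pen <- function(b, lambda, gamma = 2.7) {
  a <- abs(b)
  ifelse(a <= gamma * lambda,
         lambda * a - a^2 / (2 * gamma),   # penalization rate tapers off
         gamma * lambda^2 / 2)             # constant beyond gamma * lambda
}

scad_pen <- function(b, lambda, gamma = 3.7) {
  a <- abs(b)
  ifelse(a <= lambda,
         lambda * a,                       # same as LASSO near zero
         ifelse(a <= gamma * lambda,
                (2 * gamma * lambda * a - a^2 - lambda^2) / (2 * (gamma - 1)),
                lambda^2 * (gamma + 1) / 2))   # constant beyond gamma * lambda
}

# Compare the three penalties on a grid of coefficient values
b <- seq(-3, 3, by = 0.5)
pen_table <- cbind(b,
                   lasso = lasso_pen(b, 0.5),
                   mcp   = mcp_pen(b, 0.5),
                   scad  = scad_pen(b, 0.5))
```

Unlike LASSO, both MCP and SCAD level off for large coefficients, so large effects are not shrunk further once they pass gamma * lambda.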

References

Breheny, P. and Huang, J. (2011). Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann. Appl. Statist., 5: 232-253.
Fine, J. and Gray, R. (1999). A proportional hazards model for the subdistribution of a competing risk. JASA, 94: 496-509.
Fu, Z., Parikh, C. and Zhou, B. (2015). Penalized variable selection in competing risks regression. Manuscript submitted for publication.

Examples

  library(crrp)    # provides crrp()
  library(cmprsk)  # provides crr(), used below for the adaptive LASSO weights

  # simulate competing risks data
  set.seed(10)
  ftime <- rexp(200)
  fstatus <- sample(0:2, 200, replace = TRUE)
  cov <- matrix(runif(1000), nrow = 200)
  dimnames(cov)[[2]] <- c('x1', 'x2', 'x3', 'x4', 'x5')

  # fit LASSO
  fit <- crrp(ftime, fstatus, cov, penalty = "LASSO")
  # use BIC to select the tuning parameter
  beta <- fit$beta[, which.min(fit$BIC)]
  beta.se <- fit$SE[, which.min(fit$BIC)]

  # fit adaptive LASSO: weights are inverse absolute coefficients from an unpenalized crr() fit
  weight <- 1 / abs(crr(ftime, fstatus, cov)$coef)
  fit2 <- crrp(ftime, fstatus, cov, penalty = "LASSO", penalty.factor = weight, weighted = TRUE)
  beta2 <- fit2$beta[, which.min(fit2$BIC)]
  beta2.se <- fit2$SE[, which.min(fit2$BIC)]
