crrp (version 1.0)

crrp: Penalized variable selection at the individual level in competing risks regression

Description

Extends R package ncvreg to the proportional subdistribution hazards model. Penalties include LASSO, SCAD, and MCP. User-specified weights can be assigned to the penalty for each coefficient.

Usage

crrp(time, fstatus, X, failcode = 1, cencode = 0,
     penalty = c("MCP", "SCAD", "LASSO"),
     gamma = switch(penalty, SCAD = 3.7, 2.7), alpha = 1,
     lambda.min = 0.001, nlambda = 50, lambda, eps = 0.001,
     max.iter = 1000, penalty.factor = rep(1, ncol(X)),
     weighted = FALSE)

Arguments

time
vector of failure/censoring times
fstatus
vector with a unique code for each failure type and a separate code for censored observations
X
design matrix; crrp standardizes X by default
failcode
code of fstatus that denotes the failure type of interest
cencode
code of fstatus that denotes censored observations
penalty
penalty to be applied to the model. One of "MCP" (the default), "SCAD", or "LASSO"
gamma
tuning parameter of the MCP/SCAD penalty. Default is 2.7 for MCP and 3.7 for SCAD
alpha
tuning parameter indicating contributions from the MCP/SCAD penalty and the L2 penalty. alpha=1 is equivalent to MCP/SCAD penalty, whereas alpha=0 would be equivalent to ridge regression. Default is 1
lambda.min
the smallest value for lambda. Default is 0.001
nlambda
number of lambda values. Default is 50
lambda
a user-specified sequence of lambda values. If not specified, a sequence of values of length nlambda is provided
eps
iteration stops when the relative change in any coefficient is less than eps. Default is 0.001
max.iter
maximum number of iterations. Default is 1000
penalty.factor
a vector of weights applied to the penalty for each coefficient. The length of the vector must be equal to the number of columns of X
weighted
if TRUE, the penalty is weighted and the user must supply the weights (see penalty.factor and the adaptive LASSO example). Default is FALSE
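
The penalty and gamma arguments are easier to compare with the penalty shapes side by side. The following is a minimal base R sketch of the textbook single-coefficient forms of the LASSO, MCP, and SCAD penalties. The function names lasso_pen, mcp_pen, and scad_pen are hypothetical and are not part of crrp, and the code is not taken from the crrp source; only the default gamma values (2.7 for MCP, 3.7 for SCAD) come from this page.

```r
# Textbook penalty values for one coefficient theta at tuning parameter lambda.
# Hypothetical helper names; not part of the crrp package.
lasso_pen <- function(theta, lambda) lambda * abs(theta)

mcp_pen <- function(theta, lambda, gamma = 2.7) {
  a <- abs(theta)
  # quadratic relaxation up to gamma * lambda, constant afterwards
  ifelse(a <= gamma * lambda,
         lambda * a - a^2 / (2 * gamma),
         gamma * lambda^2 / 2)
}

scad_pen <- function(theta, lambda, gamma = 3.7) {
  a <- abs(theta)
  # LASSO-like near zero, quadratic transition, then constant
  ifelse(a <= lambda,
         lambda * a,
         ifelse(a <= gamma * lambda,
                (2 * gamma * lambda * a - a^2 - lambda^2) / (2 * (gamma - 1)),
                lambda^2 * (gamma + 1) / 2))
}

# Compare the three penalties on a few coefficient values
theta <- c(0, 0.5, 1, 5)
penalties <- cbind(theta,
                   lasso = lasso_pen(theta, lambda = 1),
                   mcp   = mcp_pen(theta, lambda = 1),
                   scad  = scad_pen(theta, lambda = 1))
print(penalties)
```

For large coefficients the MCP and SCAD values level off while the LASSO penalty keeps growing, which is why the nonconvex penalties shrink large effects less.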

Value

Returns a list of class crrp with the following components:
$beta
matrix of fitted coefficients with nvars rows and nlambda columns
$iter
number of iterations until convergence for each lambda
$lambda
sequence of tuning parameter values
$penalty
same as above
$gamma
same as above
$alpha
same as above
$loglik
log likelihood of the fitted model at each value of lambda
$GCV
generalized cross-validation criterion of the fitted model at each value of lambda
$BIC
Bayesian information criterion of the fitted model at each value of lambda
$SE
matrix of standard errors with nvars rows and nlambda columns

Details

The crrp function penalizes the partial likelihood of the proportional subdistribution hazards model of Fine and Gray (1999) with the LASSO, SCAD, or MCP penalty. The model is fitted with a coordinate descent algorithm. BIC and GCV are returned for each value of lambda and can be used to select the optimal tuning parameter.
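
To illustrate what a coordinate-wise update looks like under each penalty, here is a base R sketch of the univariate thresholding rules described by Breheny and Huang (2011) for a standardized covariate. It assumes a least-squares-type one-dimensional problem with unpenalized solution z; how crrp applies these rules to the partial likelihood internally is not stated on this page, so treat this as an illustration rather than the package's implementation. The names soft, mcp_update, and scad_update are hypothetical.

```r
# Soft-thresholding: the LASSO update for one standardized coordinate
soft <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

# MCP update (requires gamma > 1): rescaled soft-thresholding for
# |z| <= gamma * lambda, and no shrinkage beyond that point
mcp_update <- function(z, lambda, gamma = 2.7) {
  ifelse(abs(z) <= gamma * lambda,
         soft(z, lambda) / (1 - 1 / gamma),
         z)
}

# SCAD update (requires gamma > 2): soft-thresholding near zero,
# a rescaled transition region, and no shrinkage for large |z|
scad_update <- function(z, lambda, gamma = 3.7) {
  ifelse(abs(z) <= 2 * lambda,
         soft(z, lambda),
         ifelse(abs(z) <= gamma * lambda,
                soft(z, gamma * lambda / (gamma - 1)) / (1 - 1 / (gamma - 1)),
                z))
}

# Small inputs are set to zero by all three rules; large inputs are
# left unshrunk by MCP and SCAD but not by the LASSO rule
z <- c(-0.5, 1.5, 3, 5)
updates <- cbind(z,
                 lasso = soft(z, 1),
                 mcp   = mcp_update(z, 1),
                 scad  = scad_update(z, 1))
print(updates)
```

A full fit would cycle such updates over all coefficients until the relative change falls below eps or max.iter is reached.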

References

  • Breheny, P. and Huang, J. (2011). Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann. Appl. Statist., 5: 232-253.
  • Fine, J. and Gray, R. (1999). A proportional hazards model for the subdistribution of a competing risk. JASA, 94: 496-509.
  • Fu, Z., Parikh, C. and Zhou, B. (2015). Penalized variable selection in competing risks regression. Manuscript submitted for publication.

See Also

gcrrp, cmprsk, ncvreg

Examples

  # load crrp and cmprsk (crr(), used below for the adaptive weights, comes from cmprsk)
  library(crrp)
  library(cmprsk)

  # simulate competing risks data
  set.seed(10)
  ftime <- rexp(200)
  fstatus <- sample(0:2, 200, replace = TRUE)
  cov <- matrix(runif(1000), nrow = 200)
  colnames(cov) <- c('x1', 'x2', 'x3', 'x4', 'x5')

  # fit LASSO
  fit <- crrp(ftime, fstatus, cov, penalty = "LASSO")
  # use BIC to select the tuning parameter
  beta <- fit$beta[, which.min(fit$BIC)]
  beta.se <- fit$SE[, which.min(fit$BIC)]

  # fit adaptive LASSO, weighting each coefficient by the inverse of its
  # absolute unpenalized estimate from crr()
  weight <- 1 / abs(crr(ftime, fstatus, cov)$coef)
  fit2 <- crrp(ftime, fstatus, cov, penalty = "LASSO", penalty.factor = weight, weighted = TRUE)
  beta2 <- fit2$beta[, which.min(fit2$BIC)]
  beta2.se <- fit2$SE[, which.min(fit2$BIC)]
  
