cv.ncvsurv: Cross-validation for ncvsurv

Description

Performs k-fold cross-validation for MCP- or SCAD-penalized survival models over a grid of values for the regularization parameter lambda.

Usage

cv.ncvsurv(X, y, ..., cluster, nfolds=10, seed, returnY=FALSE,
trace=FALSE)

Arguments

X

The design matrix, as in ncvsurv.

y

The response matrix, as in ncvsurv.

...

Additional arguments to ncvsurv.

cluster

cv.ncvsurv can be run in parallel across a cluster using the parallel package. The cluster must be set up in advance using the makeCluster function from that package, and must then be passed to cv.ncvsurv (see example).

nfolds

The number of cross-validation folds. Default is 10.

seed

You may set the seed of the random number generator in order to obtain reproducible results.

returnY

Should cv.ncvsurv return the linear predictors from the cross-validation folds? Default is FALSE; if TRUE, this will return a matrix in which the element for row i, column j is the fitted value for observation i from the fold in which observation i was excluded from the fit, at the jth value of lambda. NOTE: The rows of Y are ordered by time on study, and therefore do not correspond to the original order of observations passed to cv.ncvsurv.

trace

If set to TRUE, cv.ncvsurv will inform the user of its progress by announcing the beginning of each CV fold. Default is FALSE.

Value

An object with S3 class "cv.ncvsurv" inheriting from "cv.ncvreg" and containing:

Details

The function calls ncvsurv nfolds times, each time leaving out 1/nfolds of the data. Because of the semiparametric nature of Cox regression, cross-validation is not clearly defined. cv.ncvsurv uses the approach of calculating the full Cox partial likelihood using the cross-validated set of linear predictors. Unfortunately, using this approach there is no clear way (yet) of determining standard errors, so cv.ncvsurv, unlike cv.ncvreg, does not provide any.

Other approaches to cross-validation for the Cox regression model have been proposed; the strengths and weaknesses of the various methods for penalized regression in the Cox model are not well understood. Because of this, the approach used by cv.ncvsurv may change in the future as additional research is carried out.
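The idea of evaluating the full Cox partial likelihood at a set of linear predictors can be illustrated with a small base-R sketch. This is not the package's internal code: the helper cox_partial_loglik and its argument names (time, status, eta) are hypothetical, and the sketch assumes there are no tied event times.

```r
# Hypothetical helper (not part of ncvreg): Cox partial log-likelihood
# evaluated at a given vector of linear predictors.
# Assumes no tied event times; `time`, `status` (1 = event, 0 = censored)
# and `eta` are illustrative argument names.
cox_partial_loglik <- function(time, status, eta) {
  ord <- order(time)
  status <- status[ord]
  eta <- eta[ord]
  # After sorting by time, the risk set for subject i is subjects i..n,
  # so the risk-set sums of exp(eta) form a reverse cumulative sum.
  risk <- rev(cumsum(rev(exp(eta))))
  sum(status * (eta - log(risk)))
}

# Toy check: three subjects, events at times 1 and 2, all eta = 0.
# The risk sets have sizes 3 and 2, so the value is -log(3) - log(2).
ll <- cox_partial_loglik(time = c(1, 2, 3), status = c(1, 1, 0),
                         eta = c(0, 0, 0))
stopifnot(isTRUE(all.equal(ll, -log(6))))
```

In a cross-validation setting, eta would hold the out-of-fold linear predictors for all observations at one value of lambda, for instance one column of the matrix returned when returnY=TRUE. Since the rows of that matrix are ordered by time on study, the time and status vectors would need to be put in the same order before being compared with it.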

References

Breheny P and Huang J. (2011) Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Annals of Applied Statistics, 5: 232-253. myweb.uiowa.edu/pbreheny/publications/Breheny2011.pdf

Verweij PJ and van Houwelingen HC. (1993) Cross-validation in survival analysis. Statistics in Medicine, 12: 2305-2314.

Examples

data(Lung)
X <- Lung$X
y <- Lung$y

cvfit <- cv.ncvsurv(X, y)
summary(cvfit)
plot(cvfit)
plot(cvfit, type="rsq")

## requires loading the parallel package
## Not run: 
# library(parallel)
# cl <- makeCluster(4)
# cvfit <- cv.ncvsurv(X, y, cluster=cl)
# stopCluster(cl)
## End(Not run)