cv.grpsurv: Cross-validation for grpsurv

Description

Performs k-fold cross-validation for penalized Cox regression models with grouped covariates over a grid of values for the regularization parameter lambda.

Usage

cv.grpsurv(X, y, group, ..., nfolds=10, seed, cv.ind, returnY=FALSE,
trace=FALSE)

Arguments

X

The design matrix, as in grpsurv.

y

The response matrix, as in grpsurv.

group

The grouping vector, as in grpsurv.

...

Additional arguments to grpsurv.

nfolds

The number of cross-validation folds. Default is 10.

seed

You may set the seed of the random number generator in order to obtain reproducible results.

cv.ind

Which fold each observation belongs to. By default the observations are randomly assigned by cv.grpsurv.

returnY

Should cv.grpsurv return the linear predictors from the cross-validation folds? Default is FALSE. If TRUE, the result includes a matrix Y in which the element in row i, column j is the fitted value for observation i at the jth value of lambda, taken from the fold in which observation i was excluded from the fit. NOTE: The rows of Y are ordered by time on study, and therefore do not correspond to the original order of the observations passed to cv.grpsurv.

trace

If set to TRUE, cv.grpsurv will inform the user of its progress by announcing the beginning of each CV fold. Default is FALSE.
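As a brief illustration of the seed and returnY arguments, the following sketch uses the Lung data from the Examples. It assumes that the grpreg package is installed and that the matrix of linear predictors is stored in a component named Y, as the note under returnY suggests; the seed value is arbitrary.

```r
library(grpreg)
data(Lung)

# Same seed, same random fold assignment, hence the same error curve
cv1 <- cv.grpsurv(Lung$X, Lung$y, Lung$group, seed = 1)
cv2 <- cv.grpsurv(Lung$X, Lung$y, Lung$group, seed = 1)
same_cve <- isTRUE(all.equal(cv1$cve, cv2$cve))

# Keep the cross-validated linear predictors as well
# (component name Y is assumed from the returnY description)
cv3 <- cv.grpsurv(Lung$X, Lung$y, Lung$group, seed = 1, returnY = TRUE)
dim(cv3$Y)  # observations (sorted by time on study) by lambda values
```

Because the rows of Y follow time on study, they should not be matched to the rows of X by position.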

Value

An object with S3 class "cv.grpsurv" inheriting from "cv.grpreg" and containing:

cve

The error for each value of lambda, averaged across the cross-validation folds.

lambda

The sequence of regularization parameter values along which the cross-validation error was calculated.

fit

The fitted grpsurv object for the whole data.

min

The index of lambda corresponding to lambda.min.

lambda.min

The value of lambda with the minimum cross-validation error.

null.dev

The cross-validated deviance for the first model along the grid of lambda (i.e., the cross-validated deviance for max(lambda), unless you have supplied your own lambda sequence, in which case this quantity is probably not meaningful). Although the actual null deviance can be calculated, it cannot be compared with the cross-validated deviance because of the manner in which deviance must be calculated for Cox regression models (see Details).
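The components listed above can be inspected directly on the returned object. The sketch below assumes the grpreg package is installed and uses the Lung data from the Examples; the seed value is arbitrary.

```r
library(grpreg)
data(Lung)
cvfit <- cv.grpsurv(Lung$X, Lung$y, Lung$group, seed = 1)

best <- cvfit$min            # index of the selected lambda
cvfit$lambda.min             # the selected lambda itself
cvfit$lambda[best]           # the same value, looked up by index
cvfit$cve[best]              # cross-validation error at that lambda

# Coefficients at lambda.min, as in the Examples
b <- coef(cvfit)
```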

Details

The function calls grpsurv nfolds times, each time leaving out 1/nfolds of the data. Because of the semiparametric nature of Cox regression, cross-validation is not clearly defined. cv.grpsurv uses the approach of calculating the full Cox partial likelihood using the cross-validated set of linear predictors. Unfortunately, using this approach there is no clear way (yet) of determining standard errors, so cv.grpsurv, unlike cv.grpreg, does not provide any.
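To make the idea of evaluating the full partial likelihood at a set of linear predictors concrete, here is a minimal sketch in base R. The helper cox_partial_loglik is hypothetical and not part of grpreg; it assumes right-censored data and Breslow-style risk sets for tied times, and it is not necessarily how cv.grpsurv computes or normalizes its error internally.

```r
# Hypothetical helper (not part of grpreg): Cox partial log-likelihood
# evaluated at a given vector of linear predictors eta.
#   time   : observed times
#   status : 1 = event, 0 = censored
cox_partial_loglik <- function(eta, time, status) {
  w <- exp(eta)
  # Risk-set total for each subject: sum of exp(eta_j) over all j still at risk
  risk <- vapply(time, function(t) sum(w[time >= t]), numeric(1))
  # Only observed events contribute a term
  sum((eta - log(risk))[status == 1])
}

# Toy check: with all linear predictors equal to zero, each event
# contributes -log(size of its risk set)
time   <- c(1, 2, 3)
status <- c(1, 0, 1)
ll <- cox_partial_loglik(rep(0, 3), time, status)
```

In a cross-validation setting, eta would be a column of cross-validated linear predictors (for example from returnY=TRUE), with time and status arranged in the same time-on-study order, and -2 times the result gives a deviance-type quantity.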

Other approaches to cross-validation for the Cox regression model have been proposed; the strengths and weaknesses of the various methods for penalized regression in the Cox model are not well understood. Because of this, the approach used by cv.grpsurv may change in the future as additional research is carried out.

References

Verweij PJ and van Houwelingen HC. (1993) Cross-validation in survival analysis. Statistics in Medicine, 12: 2305-2314.

Examples

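library(grpreg)

data(Lung)
X <- Lung$X
y <- Lung$y
group <- Lung$group

# Cross-validate over the default lambda grid (10 folds)
cvfit <- cv.grpsurv(X, y, group)
plot(cvfit)               # cross-validation error along the lambda grid
coef(cvfit)               # coefficients at lambda.min
plot(cvfit$fit)           # coefficient paths from the fit to the whole data
plot(cvfit, type="rsq")   # the same results on the R-squared scale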
