Performs k-fold cross validation for penalized Cox regression models with grouped covariates over a grid of values for the regularization parameter lambda.
cv.grpsurv(X, y, group, ..., nfolds=10, seed, cv.ind, returnY=FALSE,
trace=FALSE)
X: The design matrix, as in grpsurv.
y: The response matrix, as in grpsurv.
group: The grouping vector, as in grpsurv.
...: Additional arguments to grpsurv.
nfolds: The number of cross-validation folds. Default is 10.
seed: You may set the seed of the random number generator in order to obtain reproducible results.
cv.ind: Which fold each observation belongs to. By default the observations are randomly assigned by cv.grpsurv.
returnY: Should cv.grpsurv return the linear predictors from the cross-validation folds? Default is FALSE; if TRUE, this will return a matrix Y in which the element in row i, column j is the fitted value for observation i from the fold in which observation i was excluded from the fit, at the jth value of lambda. NOTE: The rows of Y are ordered by time on study, and therefore do not correspond to the original order of observations passed to cv.grpsurv.
trace: If set to TRUE, cv.grpsurv will inform the user of its progress by announcing the beginning of each CV fold. Default is FALSE.
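As a rough illustration of the cv.ind argument, the sketch below builds a fold-assignment vector in base R. The sample size, fold count, and seed are made-up values, and this simple random assignment is only an assumption about what a valid cv.ind might look like; it is not necessarily how cv.grpsurv assigns folds by default.

```r
# Hypothetical sample size and number of folds, chosen only for illustration
n <- 25
nfolds <- 5

# Fix the random number generator so the assignment is reproducible
set.seed(1)

# Repeat the labels 1..nfolds to length n, then shuffle them, so that each
# observation receives one fold label and fold sizes are as equal as possible
cv.ind <- sample(rep(seq_len(nfolds), length.out = n))

# Number of observations in each fold
table(cv.ind)

# Such a vector could then be supplied to the function, e.g.:
# cvfit <- cv.grpsurv(X, y, group, cv.ind = cv.ind)
```

Supplying the same cv.ind to several calls keeps the folds identical across them, which may be convenient when comparing penalties or other settings on the same data.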
An object with S3 class "cv.grpsurv" inheriting from "cv.grpreg" and containing:
cve: The error for each value of lambda, averaged across the cross-validation folds.
lambda: The sequence of regularization parameter values along which the cross-validation error was calculated.
fit: The fitted grpsurv object for the whole data.
min: The index of lambda corresponding to lambda.min.
lambda.min: The value of lambda with the minimum cross-validation error.
null.dev: The cross-validated deviance for the first model along the grid of lambda (i.e., the cross-validated deviance for max(lambda), unless you have supplied your own lambda sequence, in which case this quantity is probably not meaningful). Although the actual null deviance can be calculated, it cannot be compared with the cross-validated deviance due to the manner in which deviance must be calculated for Cox regression models (see details).
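The relationship between the index of the best lambda and lambda.min can be illustrated with plain vectors. The numbers below are invented for the example and do not come from any fitted model; the variable names are likewise illustrative.

```r
# Hypothetical lambda grid (decreasing) and made-up cross-validation errors
lambda <- c(1, 0.5, 0.25, 0.125)
cve <- c(4.2, 3.9, 3.7, 3.8)

# Index of the lambda value with the smallest cross-validation error
min_idx <- which.min(cve)

# The corresponding lambda value, i.e. what the text calls lambda.min
lambda_min <- lambda[min_idx]

# Cross-validation error attained at that lambda
cve_at_min <- cve[min_idx]
```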
The function calls grpsurv nfolds times, each time leaving out 1/nfolds of the data. Because of the semiparametric nature of Cox regression, cross-validation is not clearly defined. cv.grpsurv uses the approach of calculating the full Cox partial likelihood using the cross-validated set of linear predictors. Unfortunately, using this approach there is no clear way (yet) of determining standard errors, so cv.grpsurv, unlike cv.grpreg, does not provide any.
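The evaluation step described above, computing the full partial likelihood from a set of cross-validated linear predictors, can be sketched in base R. The function name cox_loglik, the toy data, and the linear predictor values are all illustrative assumptions and not part of grpreg; the sketch assumes no tied event times and is not the package's actual implementation.

```r
# Cox partial log-likelihood for a given vector of linear predictors.
# Assumes distinct observation times (no special handling of ties).
cox_loglik <- function(time, status, eta) {
  # Sort observations by time on study
  ord <- order(time)
  status <- status[ord]
  eta <- eta[ord]
  # After sorting, the risk set of observation i is observations i..n, so a
  # reversed cumulative sum gives the sum of exp(eta) over each risk set
  risk <- rev(cumsum(rev(exp(eta))))
  # Only observed events (status == 1) contribute to the partial likelihood
  sum(status * (eta - log(risk)))
}

# Toy survival data: follow-up times and event indicators (1 = event)
time <- c(5, 8, 12, 3, 9, 15)
status <- c(1, 0, 1, 1, 1, 0)

# Hypothetical out-of-fold linear predictors: each value is imagined to come
# from a model fit with that observation's fold left out
eta_cv <- c(0.2, -0.1, -0.4, 0.6, 0.1, -0.5)

# Deviance based on the full partial likelihood at the cross-validated
# predictors, and the same quantity with all linear predictors set to zero
dev_cv <- -2 * cox_loglik(time, status, eta_cv)
dev_zero <- -2 * cox_loglik(time, status, rep(0, length(time)))
```

Because every observation contributes a single out-of-fold linear predictor and the likelihood is evaluated once over the whole data set, there are no per-fold error values to average, which is consistent with the remark that standard errors are not readily available under this approach.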
Other approaches to cross-validation for the Cox regression model have been proposed; the strengths and weaknesses of the various methods for penalized regression in the Cox model are not well understood. Because of this, the approach used by cv.grpsurv may change in the future as additional research is carried out.
Verweij PJ and van Houwelingen HC. (1993) Cross-validation in survival analysis. Statistics in Medicine, 12: 2305-2314.
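library(grpreg)

# Load the example data and extract the design matrix, response, and grouping vector
data(Lung)
X <- Lung$X
y <- Lung$y
group <- Lung$group

# Cross-validate over the lambda grid (10 folds by default)
cvfit <- cv.grpsurv(X, y, group)
plot(cvfit)               # cross-validation error curve
coef(cvfit)               # coefficients at lambda.min
plot(cvfit$fit)           # coefficient paths from the fit to the whole data
plot(cvfit, type="rsq")   # cross-validated R-squared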