Performs k-fold cross validation for penalized regression models with grouped covariates over a grid of values for the regularization parameter lambda.
cv.grpreg(X, y, group=1:ncol(X), ..., nfolds=10, seed, cv.ind,
returnY=FALSE, trace=FALSE)
X: The design matrix, as in grpreg.
y: The response vector (or matrix), as in grpreg.
group: The grouping vector, as in grpreg.
...: Additional arguments to grpreg.
nfolds: The number of cross-validation folds. Default is 10.
seed: You may set the seed of the random number generator in order to obtain reproducible results.
cv.ind: Which fold each observation belongs to. By default the observations are randomly assigned by cv.grpreg.
returnY: Should cv.grpreg return the fitted values from the cross-validation folds? Default is FALSE; if TRUE, this will return a matrix in which the element for row i, column j is the fitted value for observation i from the fold in which observation i was excluded from the fit, at the jth value of lambda (see the sketch following this list).
trace: If set to TRUE, cv.grpreg will inform the user of its progress by announcing the beginning of each CV fold. Default is FALSE.
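As a rough illustration of how these arguments fit together (a minimal sketch, not part of the package documentation; the element name Y holding the cross-validated fitted values is an assumption based on the returnY description above):

library(grpreg)
data(Birthwt)
X <- Birthwt$X
y <- Birthwt$bwt
group <- Birthwt$group
## Reproducible 5-fold CV; keep the held-out fitted values and print progress
cvfit <- cv.grpreg(X, y, group, nfolds=5, seed=1, returnY=TRUE, trace=TRUE)
dim(cvfit$Y)  ## rows = observations, columns = values of lambda (element name Y assumed)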
An object with S3 class "cv.grpreg" containing:
cve: The error for each value of lambda, averaged across the cross-validation folds.
cvse: The estimated standard error associated with each value of cve.
lambda: The sequence of regularization parameter values along which the cross-validation error was calculated.
fit: The fitted grpreg object for the whole data.
min: The index of lambda corresponding to lambda.min.
lambda.min: The value of lambda with the minimum cross-validation error.
null.dev: The deviance for the intercept-only model.
pe: If family="binomial", the cross-validation prediction error for each value of lambda.
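For example, these elements might be inspected as follows (a minimal sketch assuming a fit named cvfit, as in the examples below):

cvfit$lambda.min                          ## value of lambda minimizing CV error
cvfit$cve[cvfit$min]                      ## CV error at lambda.min
cvfit$cvse[cvfit$min]                     ## its estimated standard error
coef(cvfit$fit, lambda=cvfit$lambda.min)  ## full-data coefficients at lambda.min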
The function calls grpreg nfolds times, each time leaving out 1/nfolds of the data. The cross-validation error is based on the residual sum of squares when family="gaussian" and the deviance when family="binomial" or family="poisson".
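To make this concrete for a Gaussian response, the cross-validation error can be reproduced (up to the exact averaging convention, which may be a mean rather than a sum of squared errors) from the held-out fitted values returned with returnY=TRUE; this is a sketch assuming X, y, and group as in the examples below, and assuming the held-out fitted values are stored in the element Y:

cvfit <- cv.grpreg(X, y, group, seed=1, returnY=TRUE)
err <- apply(cvfit$Y, 2, function(yhat) mean((y - yhat)^2))  ## mean squared held-out error per lambda
head(cbind(err, cvfit$cve))  ## err should track cve, up to a possible mean-vs-sum scaling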
For Gaussian and Poisson responses, the folds are chosen according to simple random sampling. For binomial responses, the numbers for each outcome class are balanced across the folds; i.e., the number of outcomes in which y is equal to 1 is the same for each fold, or possibly off by 1 if the numbers do not divide evenly.
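If direct control over the fold assignment is needed (for example, to enforce a particular balancing), a vector can be supplied through cv.ind; here is a minimal sketch for a hypothetical binary outcome ybin with 10 folds (ybin is simulated purely for illustration):

ybin <- rbinom(nrow(X), 1, 0.5)  ## hypothetical binary response
cv.ind <- numeric(length(ybin))
cv.ind[ybin == 1] <- sample(rep(1:10, length.out=sum(ybin == 1)))
cv.ind[ybin == 0] <- sample(rep(1:10, length.out=sum(ybin == 0)))
cvfit.bin <- cv.grpreg(X, ybin, group, family="binomial", cv.ind=cv.ind)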
As in grpreg, seemingly unrelated regressions/multitask learning can be carried out by setting y to be a matrix, in which case groups are set up automatically (see grpreg for details), and cross-validation is carried out with respect to rows of y. As mentioned in the details there, it is recommended to standardize the responses prior to fitting.
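A sketch of the multitask case, using a hypothetical two-column response built from the Birthwt data (the second column is simulated noise added purely for illustration, and both columns are standardized as recommended; no group argument is supplied since groups are set up automatically):

set.seed(1)
Y <- cbind(scale(Birthwt$bwt),
           scale(Birthwt$bwt + rnorm(length(Birthwt$bwt))))
cvfit.mt <- cv.grpreg(X, Y)  ## cross-validation is carried out over rows of Y
plot(cvfit.mt)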
grpreg, plot.cv.grpreg, summary.cv.grpreg, predict.cv.grpreg
data(Birthwt)
X <- Birthwt$X
y <- Birthwt$bwt
group <- Birthwt$group
cvfit <- cv.grpreg(X, y, group)
plot(cvfit)
summary(cvfit)
coef(cvfit) ## Beta at minimum CVE
## Refit using the group exponential lasso (gel) penalty
cvfit <- cv.grpreg(X, y, group, penalty="gel")
plot(cvfit)
summary(cvfit)