mpath (version 0.1-20)

cv.glmregNB: Cross-validation for glmregNB

Description

Performs k-fold cross-validation for glmregNB, produces a plot, and returns the cross-validated log-likelihood values for each value of lambda.

Usage

cv.glmregNB(formula, data, weights, lambda=NULL,
nfolds=10, foldid, plot.it=TRUE, se=TRUE, trace=FALSE,...)

Arguments

formula
symbolic description of the model
data
arguments controlling formula processing via model.frame.
weights
Observation weights; defaults to 1 per observation
lambda
Optional user-supplied lambda sequence; default is NULL, and glmregNB chooses its own sequence
nfolds
number of folds - default is 10. Although nfolds can be as large as the sample size (leave-one-out CV), it is not recommended for large datasets. Smallest value allowable is nfolds=3
foldid
an optional vector of values between 1 and nfolds identifying which fold each observation is in. If supplied, nfolds can be missing.
plot.it
a logical value, to plot the estimated log-likelihood values if TRUE.
se
a logical value, to plot with standard errors.
trace
if TRUE, shows cross-validation progress
...
Other arguments that can be passed to glmregNB.
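
A fold vector suitable for foldid can be built in base R. The following is a minimal sketch, not taken from the package: the sample size n is a hypothetical placeholder (in practice use the number of rows of your data), and the seed is arbitrary.

```r
set.seed(123)     # arbitrary seed, so the fold assignment can be reproduced
n <- 915          # hypothetical number of observations; use nrow(your_data)
nfolds <- 10

# Repeat the labels 1..nfolds until there are n of them, then shuffle
foldid <- sample(rep(seq_len(nfolds), length.out = n))

table(foldid)     # fold sizes differ by at most one observation
```

Passing this vector as foldid fixes the split, so repeated calls are evaluated on identical folds.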

Value

An object of class "cv.glmregNB" is returned, which is a list with the components of the cross-validation fit:

  • fit: a fitted glmregNB object for the full data.
  • residmat: matrix of log-likelihood values, with one row per lambda value and one column per cross-validation fold.
  • cv: the mean cross-validated log-likelihood values, a vector of length length(lambda).
  • cv.error: the standard error of the cross-validated log-likelihood values, a vector of length length(lambda).
  • fraction: a vector of lambda values, of length length(lambda).
  • foldid: indicators of the fold each observation was assigned to, kept for reproducibility.
  • lambda.which: index of the lambda that gives the maximum cv value.
  • lambda.optim: value of the lambda that gives the maximum cv value.
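
To illustrate how the summary components relate to residmat, here is a small sketch on a made-up matrix. It assumes that cv is the row mean of residmat and that cv.error is the standard deviation across folds divided by the square root of the number of folds; this page does not spell out the exact computation, so treat both as assumptions.

```r
# Toy stand-in for residmat: 3 lambda values (rows) x 4 folds (columns).
# The numbers are invented purely for illustration.
residmat <- matrix(c(-2.10, -2.00, -2.20, -2.10,
                     -1.90, -1.80, -2.00, -1.90,
                     -2.00, -2.10, -1.90, -2.20),
                   nrow = 3, byrow = TRUE)
nfolds <- ncol(residmat)

cv <- rowMeans(residmat)                            # mean log-likelihood per lambda
cv.error <- apply(residmat, 1, sd) / sqrt(nfolds)   # assumed standard error across folds
lambda.which <- which.max(cv)                       # index of the best lambda (row 2 here)
```

Because the criterion is a log-likelihood, the preferred lambda is the one with the largest cv value, not the smallest.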

Details

The function runs glmregNB nfolds+1 times: the first run obtains the lambda sequence, and the remaining runs compute the fit with each of the folds omitted in turn. The error is accumulated, and the average error and standard deviation over the folds are computed.

Note that cv.glmregNB does NOT search over values of alpha. A specific value should be supplied; otherwise alpha=1 is assumed by default. Users who would like to cross-validate alpha as well should pre-compute a fold vector foldid and pass the same vector to separate calls to cv.glmregNB with different values of alpha.
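
The approach to cross-validating alpha can be sketched as follows. This is an illustrative outline, not the package's own recipe: the candidate values in alphas, the seed, and the helper names cv_fits, best_cv and best_alpha are all assumptions, and alpha is taken to reach glmregNB through the ... argument. The requireNamespace check only skips the fit when mpath or pscl is not installed.

```r
alphas <- c(0.5, 1)   # candidate alpha values (illustrative choice)
nfolds <- 10

if (requireNamespace("mpath", quietly = TRUE) &&
    requireNamespace("pscl", quietly = TRUE)) {
  library(mpath)
  data("bioChemists", package = "pscl")

  set.seed(1)
  # One shared fold assignment, so every alpha is scored on the same splits
  foldid <- sample(rep(seq_len(nfolds), length.out = nrow(bioChemists)))

  # One cross-validation run per alpha; alpha is forwarded to glmregNB via ...
  cv_fits <- lapply(alphas, function(a)
    cv.glmregNB(art ~ ., data = bioChemists, foldid = foldid,
                alpha = a, plot.it = FALSE))

  # Largest mean cross-validated log-likelihood reached for each alpha
  best_cv <- vapply(cv_fits, function(f) max(f$cv), numeric(1))
  best_alpha <- alphas[which.max(best_cv)]
}
```

Reusing foldid keeps the comparison between alpha values from being confounded by different random splits.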

References

Zhu Wang, Shuangge Ma, Michael Zappitelli, Chirag Parikh, Ching-Yun Wang and Prasad Devarajan (2014) Penalized Count Data Regression with Application to Hospital Stay after Pediatric Cardiac Surgery, Statistical Methods in Medical Research. 2014 Apr 17. [Epub ahead of print]

See Also

glmregNB, and the plot, predict, and coef methods for "cv.glmregNB" objects.

Examples

library(mpath)
data("bioChemists", package = "pscl")
# 10-fold cross-validation over the default lambda sequence
fm_nb <- cv.glmregNB(art ~ ., data = bioChemists)
plot(fm_nb)
