
mpath (version 0.1-20)

cv.zipath: Cross-validation for zipath

Description

Performs k-fold cross-validation for zipath, produces a plot, and returns cross-validated log-likelihood values for lambda.

Usage

cv.zipath(formula, data, weights, nlambda=100, lambda.count=NULL, lambda.zero=NULL,
nfolds=10, foldid, plot.it=TRUE, se=TRUE, trace=FALSE,...)
## S3 method for class 'cv.zipath':
coef(object, which=object$lambda.which, model = c("full", "count", "zero"), ...)

Arguments

formula
symbolic description of the model
data
arguments controlling formula processing via model.frame.
weights
Observation weights; defaults to 1 per observation
nlambda
number of lambda values; the default is 100.
lambda.count
Optional user-supplied lambda.count sequence; default is NULL
lambda.zero
Optional user-supplied lambda.zero sequence; default is NULL
nfolds
number of folds >=3, default is 10
foldid
an optional vector of values between 1 and nfolds identifying which fold each observation is in. If supplied, nfolds can be missing and will be ignored.
plot.it
a logical value, to plot the estimated log-likelihood values if TRUE.
se
a logical value, to plot with standard errors.
trace
if TRUE, shows cross-validation progress
...
Other arguments that can be passed to zipath.
object
object of class cv.zipath.
which
Indices of the pair of penalty parameters lambda.count and lambda.zero at which estimates are extracted. By default, the one which generates the optimal cross-validation value.
model
character specifying for which component of the model the estimated coefficients should be extracted.

Value

An object of class "cv.zipath" is returned, which is a list with the components of the cross-validation fit:

  • fit: a fitted zipath object for the full data.
  • residmat: matrix of cross-validated log-likelihood values at each (count.lambda, zero.lambda) sequence.
  • bic: matrix of BIC values, with rows for lambda and columns for the kth cross-validation fold.
  • fraction: a sequence from 1:nlambda. nlambda is the same as the argument if any one of (count.lambda, zero.lambda) is missing; otherwise nlambda=length(count.lambda).
  • cv: the mean cross-validated log-likelihood, a vector of length length(count.lambda).
  • cv.error: estimate of the standard error of cv.
  • foldid: a vector of values between 1 and nfolds identifying which fold each observation is in.
  • lambda.which: index of (count.lambda, zero.lambda) that gives the maximum cv.
  • lambda.optim: value of (count.lambda, zero.lambda) that gives the maximum cv.
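
As a rough illustration of how cv, cv.error, and lambda.which relate to the per-fold values, the sketch below uses base R only. The matrix fold_loglik and its numbers are made up for illustration (one row per penalty pair, one column per fold), and the standard error formula (per-row standard deviation divided by the square root of the number of folds) is an assumption about the usual convention, not something taken from the mpath source.

```r
# Hypothetical per-fold log-likelihoods: rows = penalty pairs, columns = folds.
# These values are illustrative only, not output from cv.zipath.
fold_loglik <- matrix(c(-2.0, -2.2, -1.8,
                        -1.5, -1.6, -1.4,
                        -1.9, -2.1, -2.0),
                      nrow = 3, byrow = TRUE)

K <- ncol(fold_loglik)                            # number of folds
cv <- rowMeans(fold_loglik)                       # mean log-likelihood per penalty pair
cv_error <- apply(fold_loglik, 1, sd) / sqrt(K)   # assumed standard error convention
lambda_which <- which.max(cv)                     # index of the penalty pair with maximum cv
```

Because the criterion is a log-likelihood, the preferred penalty pair is the one that maximizes cv, which is why which.max is used here rather than which.min.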

Details

The function runs zipath nfolds+1 times: the first run computes the (lambda.count, lambda.zero) sequence, and the remaining runs compute the fit with each of the folds omitted. The log-likelihood value is accumulated, and its average and standard deviation over the folds are computed. Note that cv.zipath can also be used to search for values of count.alpha or zero.alpha: to do so, call cv.zipath with the same fixed foldid vector for each candidate value of count.alpha or zero.alpha, so that the results are compared on identical folds.
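
One way to obtain a fixed foldid vector is sketched below. The helper make_foldid is hypothetical (it is not part of mpath) and uses base R only. The cv.zipath call uses only arguments shown in the Usage section and runs only if the mpath and pscl packages are installed.

```r
# Hypothetical helper (not part of mpath): assign each of n observations
# to one of nfolds folds, reproducibly and with near-equal fold sizes.
# Note that set.seed() resets the global random number generator state.
make_foldid <- function(n, nfolds = 10, seed = 1) {
  set.seed(seed)
  sample(rep(seq_len(nfolds), length.out = n))
}

if (requireNamespace("mpath", quietly = TRUE) &&
    requireNamespace("pscl", quietly = TRUE)) {
  data("bioChemists", package = "pscl")
  foldid <- make_foldid(nrow(bioChemists), nfolds = 10)
  # Reuse this same foldid in every call whose results are to be compared.
  fit_cv <- mpath::cv.zipath(art ~ . | ., data = bioChemists,
                             family = "poisson", nlambda = 10,
                             foldid = foldid, plot.it = FALSE)
  fit_cv$lambda.optim
}
```

To compare candidate values of count.alpha or zero.alpha, the same foldid would be passed to each call, with the alpha value supplied through the ... argument; the exact argument name should be checked against the zipath documentation for the installed version.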

By default, the coef method returns a single vector of coefficients, i.e., all coefficients are concatenated. Setting the model argument extracts the estimates for the corresponding model component.

References

Zhu Wang, Shuangge Ma, Michael Zappitelli, Chirag Parikh, Ching-Yun Wang and Prasad Devarajan (2014) Penalized Count Data Regression with Application to Hospital Stay after Pediatric Cardiac Surgery, Statistical Methods in Medical Research. 2014 Apr 17. [Epub ahead of print]

Zhu Wang, Shuangge Ma, Ching-Yun Wang, Michael Zappitelli, Prasad Devarajan and Chirag R. Parikh (2014) EM for Regularized Zero Inflated Regression Models with Applications to Postoperative Morbidity after Cardiac Surgery in Children, Statistics in Medicine. 33(29):5192-208.

Zhu Wang, Shuangge Ma and Ching-Yun Wang (2015) Variable selection for zero-inflated and overdispersed data with application to health care demand in Germany, Biometrical Journal. 57(5):867-84.

See Also

zipath, and the plot, predict, and coef methods for "cv.zipath" objects.

Examples

data("bioChemists", package = "pscl")
fm_zip <- cv.zipath(art ~ . | ., data = bioChemists, family = "poisson", nlambda=10)
coef(fm_zip)
### prediction from the best model
fm_zip_predict <- predict(object=fm_zip$fit, which=fm_zip$lambda.which, type="response", 
model=c("full"))
fm_znb <- cv.zipath(art ~ . | ., data = bioChemists, family = "negbin", nlambda=10)
coef(fm_znb)
