CVcluster: Cross-validation estimate of predictive accuracy for clustered data

Description

This function adapts cross-validation to clustered data with a categorical outcome. For example, there may be multiple observations on each individual, so that each individual forms a cluster. It requires a fitting function that accepts a model formula.
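To illustrate what cross-validation over clusters means, the following minimal sketch assigns folds at the cluster level, so that all observations on one individual fall in the same fold. It uses base R only. The data frame `toy` and the names `fold_of_cluster` and `fold` are hypothetical, and this is not necessarily how CVcluster assigns folds internally.

```r
# Hypothetical toy data: 6 individuals (clusters), 4 observations each
set.seed(29)
toy <- data.frame(id = rep(1:6, each = 4), x = rnorm(24))

nfold <- 3
ids <- unique(toy$id)

# One fold label per cluster, balanced across folds and randomly permuted
fold_of_cluster <- sample(rep(seq_len(nfold), length.out = length(ids)))

# Expand the cluster-level labels to one label per row
toy$fold <- fold_of_cluster[match(toy$id, ids)]

# Cross-tabulate: each cluster should appear in exactly one fold
table(cluster = toy$id, fold = toy$fold)
```

Assigning folds row by row instead would let observations from the same individual appear in both the training and the test part, which tends to make the accuracy estimate too optimistic.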

Usage

CVcluster(formula, id, data, na.action=na.omit, nfold = 15, FUN = lda,
              predictFUN=function(x, newdata, ...)predict(x, newdata, ...)$class,
              printit = TRUE, cvparts = NULL, seed = 29)

Arguments

formula

Model formula

id

numeric, identifies clusters

data

data frame that supplies the data

na.action

na.omit (default, as shown in Usage) or na.fail

nfold

Number of cross-validation folds

FUN

function that fits the model

predictFUN

function that gives predicted values

printit

Should summary information be printed?

cvparts

Use, if required, to specify the precise folds used for the cross-validation. The comparison between different models will be more accurate if the same folds are used.
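The benefit of reusing folds can be sketched with a hand-written loop. The example below builds one fold vector and evaluates two lda models on identical splits. The helper `cv_accuracy`, the made-up cluster ids added to `iris`, and the representation of the folds as one fold number per row are all assumptions for illustration; the format that `cvparts` expects is not documented here.

```r
library(MASS)  # provides lda()

# Hypothetical clustered data: iris rows grouped into 30 made-up clusters
dat <- iris
dat$id <- rep(1:30, each = 5)

# Fixed fold vector, one entry per row, built at the cluster level
set.seed(29)
nfold <- 5
fold_of_cluster <- sample(rep(seq_len(nfold), length.out = 30))
folds <- fold_of_cluster[dat$id]

# Hypothetical helper: cross-validated accuracy of lda() for given folds
cv_accuracy <- function(formula, data, folds) {
  pred <- character(nrow(data))
  for (k in sort(unique(folds))) {
    test <- folds == k
    fit <- lda(formula, data = data[!test, ])
    pred[test] <- as.character(predict(fit, newdata = data[test, ])$class)
  }
  mean(pred == as.character(data$Species))
}

# Both models see exactly the same train/test splits
acc_full  <- cv_accuracy(Species ~ Sepal.Length + Sepal.Width +
                           Petal.Length + Petal.Width, dat, folds)
acc_petal <- cv_accuracy(Species ~ Petal.Length + Petal.Width, dat, folds)
c(full = acc_full, petal = acc_petal)
```

Because the splits are shared, any difference between the two accuracies reflects the models rather than the random choice of folds.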

seed

Set seed, if required, so that results are exactly reproducible

Value

class

Predicted values from cross-validation

CVaccuracy

Cross-validation estimate of accuracy

confusion

Confusion matrix
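As a rough illustration of how these three quantities relate, the sketch below derives a confusion matrix and an overall accuracy from a vector of predicted classes. The vectors `observed` and `predicted` are made-up values, and whether the returned confusion matrix holds counts or proportions is not stated here; counts are assumed.

```r
# Hypothetical observed and cross-validated predicted classes
observed  <- factor(c("a", "a", "b", "b", "c", "c"))
predicted <- factor(c("a", "b", "b", "b", "c", "a"), levels = levels(observed))

# Confusion matrix of counts: rows are observed classes, columns are predictions
confusion <- table(observed = observed, predicted = predicted)

# Overall accuracy: correctly classified cases (the diagonal) over all cases
accuracy <- sum(diag(confusion)) / sum(confusion)

confusion
accuracy
```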

References

http://www.maths.anu.edu.au/~johnm/nzsr/taws.html

Examples


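## Run only if both packages can be loaded; && is the scalar logical AND
if (require(mlbench) && require(MASS)) {
  data(Vowel)
  ## V1 is supplied as the cluster identifier
  acc <- CVcluster(formula = Class ~ ., id = V1, data = Vowel, nfold = 15, FUN = lda,
                   predictFUN = function(x, newdata, ...) predict(x, newdata, ...)$class,
                   printit = TRUE, cvparts = NULL, seed = 29)
}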
