crossvaldata: Computes k-fold cross validation for rminer models.

Description

Computes k-fold cross validation for rminer models.

Usage

crossvaldata(x, data, theta.fit, theta.predict, ngroup = 10, 
             mode = "stratified", seed = NULL, model, task, feature = "none",
             ...)

Value

Returns a list with:

$cv.fit -- all predictions (factor if task="class", matrix if task="prob" or numeric if task="reg");
$model -- vector list with the model for each fold;
$mpar -- vector list with the mpar for each fold;
$attributes -- the selected attributes for each fold if a feature selection algorithm was adopted;
$ngroup -- the number of folds;
$leave.out -- the computed size for each fold (=nrow(data)/ngroup);
$groups -- vector list with the indexes of each group;
$call -- the call of this function.
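
To make the shape of this list more concrete, the following base R sketch runs a generic k-fold loop and returns a few of the same components (cv.fit, model, ngroup, groups). It is not the rminer source: kfold_sketch is a hypothetical helper, the sequential fold split is an assumption made for brevity, and lm and predict merely stand in for theta.fit and theta.predict.

```r
# Hypothetical sketch (not the rminer implementation): a generic k-fold loop
# that calls a fitting function and a prediction function on each fold.
kfold_sketch <- function(formula, data, theta.fit, theta.predict, ngroup = 10) {
  n <- nrow(data)
  # Assumption: simple sequential split, consecutive rows share a fold
  fold_id <- sort(rep_len(seq_len(ngroup), n))
  groups <- split(seq_len(n), fold_id)   # row indexes of each fold
  cv.fit <- numeric(n)                   # out-of-fold predictions
  model <- vector("list", ngroup)        # one fitted model per fold
  for (j in seq_len(ngroup)) {
    test <- groups[[j]]
    # fit on all rows except fold j, then predict the rows of fold j
    model[[j]] <- theta.fit(formula, data[-test, , drop = FALSE])
    cv.fit[test] <- theta.predict(model[[j]], data[test, , drop = FALSE])
  }
  list(cv.fit = cv.fit, model = model, ngroup = ngroup, groups = groups)
}

data(cars)
res <- kfold_sketch(dist ~ speed, cars, theta.fit = lm,
                    theta.predict = predict, ngroup = 5)
# cross-validated root mean squared error
print(sqrt(mean((cars$dist - res$cv.fit)^2)))
```

Every row is predicted exactly once, by a model that did not see it during fitting, which is why cv.fit has one entry per row of data.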

Arguments

x: See fit for details.
data: See fit for details.
theta.fit: fitting function (e.g. fit).
theta.predict: prediction function (e.g. predict).
ngroup: number of folds (default is 10).
mode: Possibilities are: "stratified", "random" or "order" (see holdout for details).
seed: if NULL, no seed is set and the current R random state is used; otherwise a fixed seed is adopted to generate local random sample sequences, so the same seed always returns the same result (local means that it does not affect the state of other random number generation called after this function; see the holdout example).
model: See fit for details.
task: See fit for details.
feature: See fit for details.
...: Additional parameters sent to theta.fit or theta.predict (e.g. search).
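
The "local" behaviour described for seed can be illustrated in base R. This is only a sketch under stated assumptions: with_local_seed is a hypothetical helper, not an rminer function, and rminer may implement local seeding differently. The helper saves the global random state, applies the seed, evaluates the expression and then restores the saved state.

```r
# Hypothetical helper (not part of rminer): evaluate expr under a fixed seed
# without disturbing the caller's random number stream.
with_local_seed <- function(seed, expr) {
  has_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (has_seed) old <- get(".Random.seed", envir = globalenv())
  on.exit({
    if (has_seed) {
      assign(".Random.seed", old, envir = globalenv())  # restore saved state
    } else {
      rm(".Random.seed", envir = globalenv())           # there was no state
    }
  })
  set.seed(seed)
  expr  # lazily evaluated here, i.e. after set.seed()
}

set.seed(1)
a <- runif(1)                             # first draw after set.seed(1)

set.seed(1)
s1 <- with_local_seed(123, sample(10))    # seeded, local random sequence
b <- runif(1)                             # still the first draw after set.seed(1)

s2 <- with_local_seed(123, sample(10))
print(identical(a, b))
print(identical(s1, s2))
```

Both comparisons hold: the same seed reproduces the same sample, and the draw made after the seeded call is unaffected by it.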

Author

This function was adapted by Paulo Cortez from the crossval function of the bootstrap library (S original by R. Tibshirani and R port by F. Leisch).

Details

Standard k-fold cross-validation adapted for rminer models. By default, classification tasks ("class" or "prob") use stratified sampling, which keeps the class distribution as similar as possible across folds, unless mode is set to "random" or "order" (see holdout for details).
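
As an illustration of what stratified fold assignment means, the base R sketch below deals the rows of each class round-robin into the folds. It is not the code used by rminer: stratified_folds is a hypothetical helper, and the round-robin scheme is one common way to stratify, assumed here for simplicity.

```r
# Hypothetical helper (not part of rminer): assign each row to one of k folds
# so that every class is spread as evenly as possible across the folds.
stratified_folds <- function(y, k, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # note: this changes the global RNG state
  y <- as.character(y)
  fold <- integer(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    idx <- idx[sample.int(length(idx))]            # shuffle rows of this class
    fold[idx] <- rep_len(seq_len(k), length(idx))  # deal them round-robin
  }
  fold
}

data(iris)
folds <- stratified_folds(iris$Species, k = 3, seed = 12345)
# class counts per fold: each class contributes almost equally to every fold
print(table(iris$Species, folds))
```

With a purely random split, the per-fold class counts would vary by chance; here they differ by at most one row per class.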

References

See the crossval function of the bootstrap package.

Examples

### dontrun (rendered here as if (FALSE)) is used when the execution of the example requires some computational effort.
if (FALSE) {
 library(rminer) # provides crossvaldata, fit and mmetric
 data(iris)
 # 3-fold cross validation using fit and predict
 # the control argument is sent to rpart function 
 # rpart.control() is from the rpart package
 M=crossvaldata(Species~.,iris,fit,predict,ngroup=3,seed=12345,model="rpart",
                task="prob", control = rpart::rpart.control(cp=0.05))
 print("cross validation object:")
 print(M)
 C=mmetric(iris$Species,M$cv.fit,metric="CONF")
 print("confusion matrix:")
 print(C)
}
