llama (version 0.9.2)

cvFolds: Cross-validation folds

Description

Takes data produced by input and amends it with (optionally stratified) folds for cross-validation.

Usage

cvFolds(data, nfolds = 10L, stratify = FALSE)

Arguments

data

the data to use. The structure returned by input.

nfolds

the number of folds. Defaults to 10. If -1 is given, leave-one-out cross-validation folds are produced.

stratify

whether to stratify the folds. Only really makes sense for classification models. Defaults to FALSE.
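
A quick sketch of the nfolds = -1 case described above, using the satsolvers example data set shipped with llama: leave-one-out cross-validation produces one fold per instance, so the returned train and test lists have as many entries as the data has rows.

```r
library(llama)
data(satsolvers)

# leave-one-out: one test set per instance
loo = cvFolds(satsolvers, -1L)
length(loo$test)   # equals the number of instances in satsolvers
```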

Value

train

a list of index sets for training.

test

a list of index sets for testing.

In addition, the returned structure retains all original members of data. See input.

Details

Partitions the data set into folds of approximately equal size. Stratification, if requested, is done by the best algorithm, i.e. the algorithm with the best performance on each instance, so that the distribution of best algorithms is approximately the same in each fold. The folds are assembled into training and test sets by combining n-1 folds for training and using the remaining fold for testing. These sets of indices are added to the original data set and returned.

If the data set has train and test partitions already, they are overwritten.
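
The assembly described above can be checked directly: with the default 10 folds, the result carries 10 training index sets and 10 test index sets, where each training set is the union of the nine folds not used for the corresponding test set. A sketch using the satsolvers data from the examples below:

```r
library(llama)
data(satsolvers)

folds = cvFolds(satsolvers)
length(folds$train)   # 10 training index sets
length(folds$test)    # 10 test index sets

# each pair partitions the instances: training set i is made up of
# the nine folds that do not form test set i
```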

See Also

bsFolds, trainTest

Examples

data(satsolvers)
folds = cvFolds(satsolvers)

# use 5 folds instead of the default 10
folds5 = cvFolds(satsolvers, 5L)

# stratify by best algorithm
foldsU = cvFolds(satsolvers, stratify = TRUE)
