
emil (version 2.2.3)

resample: Resampling schemes

Description

Performance evaluation and parameter tuning use resampling methods to estimate model performance. A resampling method is defined by a resampling scheme: a data frame in which each column corresponds to one division of the data set into mutually exclusive training and test sets. Repeated holdout and cross-validation are two methods for creating such schemes.
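To make the data-frame layout concrete, here is a minimal base R sketch that builds a small holdout-style scheme by hand. It does not use emil: the helper name make_holdout_scheme and the column names fold1, fold2, ... are assumptions made for illustration, and the schemes returned by resample may carry additional structure that this sketch does not reproduce.

```r
# Hypothetical helper (not part of emil): build a holdout-style scheme in base R.
# Each column is one fold; TRUE marks a training observation, FALSE a test observation.
make_holdout_scheme <- function(n, test_fraction = 0.5, nfold = 5) {
  n_test <- round(n * test_fraction)
  folds <- lapply(seq_len(nfold), function(i) {
    in_training <- rep(TRUE, n)
    in_training[sample.int(n, n_test)] <- FALSE  # draw the test set at random
    in_training
  })
  names(folds) <- paste0("fold", seq_len(nfold))
  as.data.frame(folds)
}

set.seed(1)
scheme <- make_holdout_scheme(n = 12, test_fraction = 1/3, nfold = 3)
print(scheme)

# Number of test observations in each fold (4 of 12 here)
colSums(!as.matrix(scheme))
```

Every row is an observation and every column is an independent split, so training and test sets within a column are mutually exclusive by construction.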

Usage

resample(method, y, ..., subset = TRUE)

resample_holdout(y, test_fraction = 0.5, nfold = 5, balanced = is.factor(y), subset)

resample_crossvalidation(y, nfold = 5, nrepeat = 5, balanced = is.factor(y), subset)

resample_bootstrap(y, nfold = 10, fit_fraction = if (replace) 1 else 0.632, replace = TRUE, balanced = is.factor(y), subset)

Arguments

Value

A data frame defining a resampling scheme. TRUE or a positive integer codes for training set and FALSE or 0 codes for test set. Positive integers > 1 code for multiple copies of an observation in the training set. NA codes for neither training nor test set and is used to exclude observations from the analysis altogether.
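The coding described above can be illustrated with a short base R sketch that turns a single fold column into training and test indices. The fold values are made up, and the helper names training_index and test_index are hypothetical, not functions from emil.

```r
# One fold column using the coding described above (values are made up):
# 2 = two copies in training, 1 = one copy, 0 = test, NA = excluded.
fold <- c(1, 0, 2, NA, 0, 1)

# Hypothetical helpers (not part of emil) that turn a fold column into indices.
training_index <- function(fold) {
  counts <- as.integer(fold)              # TRUE -> 1, FALSE -> 0, NA stays NA
  counts[is.na(counts)] <- 0L             # excluded observations contribute nothing
  rep(seq_along(counts), times = counts)  # repeat each index once per copy
}
test_index <- function(fold) {
  # FALSE == 0 is TRUE in R, so logical fold columns work as well
  which(!is.na(fold) & fold == 0)
}

training_index(fold)  # 1 3 3 6
test_index(fold)      # 2 5
```

Observation 3 appears twice in the training indices because its code is 2, and observation 4 appears in neither set because its code is NA.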

Details

Note that when setting up analyses, the user should not call resample_holdout or resample_crossvalidation directly, because resample performs additional necessary processing of the scheme.

A resampling scheme can be visualized in human-readable form with the image function.

Functions for generating custom resampling schemes should be implemented as follows and then called by resample("myMethod", ...):

resample_myMethod <- function(y, ..., subset)

The function should return a list.

See Also

emil, subresample, image.resample, index_fit

Examples

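library(emil)

# Holdout on a numeric response, with one third of the observations in each test set
resample("holdout", 1:50, test_fraction=1/3)
# Holdout on a factor response (balanced by default, since balanced = is.factor(y))
resample("holdout", factor(runif(60) >= .5))
# Cross-validation on a factor response, visualized with image
y <- factor(runif(60) >= .5)
cv <- resample("crossvalidation", y)
image(cv, main="Cross-validation scheme")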
