modelr (version 0.1.11)

crossv_mc: Generate test-training pairs for cross-validation

Description

crossv_kfold splits the data into k exclusive partitions, and uses each partition for a test-training split. crossv_mc generates n random partitions, holding out test of the data for training. crossv_loo performs leave-one-out cross-validation, i.e., n = nrow(data) training partitions containing n - 1 rows each.

Usage

crossv_mc(data, n, test = 0.2, id = ".id")

crossv_kfold(data, k = 5, id = ".id")

crossv_loo(data, id = ".id")

Value

A data frame with columns test, train, and .id. test and train are list-columns containing resample() objects. The number of rows is n for crossv_mc(), k for crossv_kfold()

and nrow(data) for crossv_loo().

Arguments

data

A data frame

n

Number of test-training pairs to generate (an integer).

test

Proportion of observations that should be held out for testing (a double).

id

Name of variable that gives each model a unique integer id.

k

Number of folds (an integer).

Examples

Run this code
cv1 <- crossv_kfold(mtcars, 5)
cv1

library(purrr)
cv2 <- crossv_mc(mtcars, 100)
models <- map(cv2$train, ~ lm(mpg ~ wt, data = .))
errs <- map2_dbl(models, cv2$test, rmse)
hist(errs)

Run the code above in your browser using DataCamp Workspace