cv-indices: Create cross-validation folds

Description

These are helper functions to create cross-validation (CV) folds, i.e., to split up the indices from 1 to n into K subsets ("folds") for K-fold CV. These functions are potentially useful when creating the cvfits and cvfun arguments for init_refmodel(). The two functions return differently structured values; see section "Value" for details.

Usage

cvfolds(n, K, seed = sample.int(.Machine$integer.max, 1))
cv_ids(
  n,
  K,
  out = c("foldwise", "indices"),
  seed = sample.int(.Machine$integer.max, 1)
)

Value

cvfolds() returns a vector of length n such that each element is an integer between 1 and K denoting which fold the corresponding data point belongs to.

The return value of cv_ids() depends on the out argument. If out = "foldwise", the return value is a list with K elements, each being a list with elements tr and ts giving the training and test indices, respectively, for the corresponding fold. If out = "indices", the return value is a list with elements tr and ts, each being a list with K elements giving the training and test indices, respectively, for each fold.
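The three return shapes described above can be illustrated with base R alone. This is only a sketch: sketch_folds() is a hypothetical helper, not projpred's implementation, and the balanced random assignment it uses is an assumption made for illustration.

```r
# Hypothetical helper, NOT projpred's implementation: it only mimics the
# documented shape of the return value of cvfolds().
sketch_folds <- function(n, K) {
  # Shuffle a balanced assignment of the indices 1..n to the folds 1..K.
  sample(rep_len(seq_len(K), n))
}

set.seed(1)
n <- 10
K <- 3
fold_of <- sketch_folds(n, K)  # integer vector of length n with values in 1..K

# Shape of out = "foldwise": a list of K lists, each with elements tr and ts.
foldwise <- lapply(seq_len(K), function(k) {
  list(tr = which(fold_of != k), ts = which(fold_of == k))
})

# Shape of out = "indices": one list with elements tr and ts, each of them a
# list of K index vectors.
indices <- list(
  tr = lapply(foldwise, function(fold) fold$tr),
  ts = lapply(foldwise, function(fold) fold$ts)
)

str(foldwise[[1]])
str(indices$ts)
```

Both formats carry the same information: foldwise[[k]]$ts and indices$ts[[k]] are the same vector. The "foldwise" shape is convenient when iterating over folds, the "indices" shape when all training or all test sets are needed at once.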

Arguments

n: Number of observations.
K: Number of folds. Must be at least 2 and not exceed n.
seed: Pseudorandom number generation (PRNG) seed by which the same results can be obtained again if needed. Passed to argument seed of set.seed(), but can also be NA to not call set.seed() at all.
out: Format of the output, either "foldwise" or "indices". See section "Value" for details.
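The effect of the seed argument can be checked directly. The sketch below assumes that projpred is installed and that cvfolds() has the signature shown in section "Usage"; the guard with requireNamespace() is added here so that the code is skipped otherwise.

```r
if (requireNamespace("projpred", quietly = TRUE)) {
  n <- 20
  # The same seed should reproduce the same fold assignment:
  folds_a <- projpred::cvfolds(n, K = 4, seed = 123)
  folds_b <- projpred::cvfolds(n, K = 4, seed = 123)
  print(identical(folds_a, folds_b))

  # With seed = NA, set.seed() is not called inside the function, so the
  # result depends on the PRNG state at the time of the call. The caller can
  # control that state beforehand:
  set.seed(123)
  folds_c <- projpred::cvfolds(n, K = 4, seed = NA)
}
```

Passing seed = NA is mainly of interest when the surrounding code already manages the PRNG state and a further call to set.seed() would interfere with it.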

Examples

n <- 100
set.seed(1234)
y <- rnorm(n)
cv <- cv_ids(n, K = 5, seed = 9876)
# Mean within the test set of each fold:
cvmeans <- sapply(cv, function(fold) mean(y[fold$ts]))
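For comparison, the same kind of summary can be computed from the fold vector returned by cvfolds(). This is a sketch under the assumption that projpred is installed and that cvfolds() behaves as described in section "Value"; the name cvmeans_folds is chosen here for illustration.

```r
if (requireNamespace("projpred", quietly = TRUE)) {
  n <- 100
  set.seed(1234)
  y <- rnorm(n)
  # Integer vector of length n giving the fold number of each observation:
  folds <- projpred::cvfolds(n, K = 5, seed = 9876)
  # Number of observations per fold:
  print(table(folds))
  # Mean within the test set of each fold (grouping y by fold number):
  cvmeans_folds <- tapply(y, folds, mean)
}
```

Because cvfolds() returns one fold label per observation, grouping functions such as tapply() or split() can be applied directly, whereas the output of cv_ids() already contains explicit training and test index vectors.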
