gfboost (version 0.1.1)

random.CVind: Cross validation index generator

Description

Simple auxiliary function for randomly generating the indices for training, validation and test data for cross validation.

Usage

random.CVind(n, ncmb, nval, CV)

Arguments

n

Number of observations (rows).

ncmb

Number of training samples for the SingBoost models in CMB. Must be an integer between 1 and \(n\).

nval

Number of validation samples in the CMB aggregation procedure. Must be an integer between 1 and \(n-n_{cmb}-1\).

CV

Number of cross validation steps. Must be a positive integer.

Value

CVind

List of row indices for training, validation and test data for each cross validation loop.

Details

The data set consists of \(n\) observations, of which \(n_{cmb}\) are used for the CMB aggregation procedure. Within CMB itself, only a subset of these observations may be used for SingBoost training. The Stability Selection is based on the validation set of \(n_{val}\) observations. The cross-validated loss of the final model is evaluated on the test set, which holds the remaining \(n-n_{cmb}-n_{val}\) observations. The three sets must be pairwise disjoint.
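
The partition described above can be illustrated with a minimal sketch. It is not the gfboost source: the helper name `sketch_cv_indices`, the element names `train`, `val` and `test`, and the exact input checks are assumptions made for illustration, and the list returned by `random.CVind` may be laid out differently. The sketch draws one random permutation of the row indices per cross validation loop and cuts it into three consecutive pieces, so the sets are disjoint by construction.

```r
# Hypothetical helper, not part of gfboost: one random three-way split per CV loop.
sketch_cv_indices <- function(n, ncmb, nval, CV) {
  # Assumed constraint: at least one observation must remain for the test set.
  stopifnot(ncmb >= 1, nval >= 1, ncmb + nval < n, CV >= 1)
  lapply(seq_len(CV), function(i) {
    perm <- sample.int(n)                     # random permutation of 1..n
    list(
      train = perm[seq_len(ncmb)],            # first ncmb indices
      val   = perm[ncmb + seq_len(nval)],     # next nval indices
      test  = perm[(ncmb + nval + 1):n]       # remaining n - ncmb - nval indices
    )
  })
}

set.seed(1)
splits <- sketch_cv_indices(n = 20, ncmb = 10, nval = 5, CV = 3)
lengths(splits[[1]])  # sizes of the three index sets in the first loop
```

Because every loop uses a fresh permutation, the three sets are disjoint within a loop, but the same row can land in different sets across loops.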