Simple auxiliary function for randomly generating the indices for training, validation and test data for cross validation.
random.CVind(n, ncmb, nval, CV)
n: Number of observations (rows).
ncmb: Number of training samples for the SingBoost models in CMB. Must be an integer between 1 and \(n\).
nval: Number of validation samples in the CMB aggregation procedure. Must be an integer between 1 and \(n-n_{cmb}-1\).
CV: Number of cross validation steps. Must be a positive integer.
List of row indices for training, validation and test data for each cross validation loop.
The data set consists of \(n\) observations. \(n_{cmb}\) of them are used for the CMB aggregation procedure. Note that within CMB itself, only a subset of these observations may be used for SingBoost training. Stability Selection is based on the validation set, which consists of \(n_{val}\) observations. The cross-validated loss of the final model is evaluated on the test data set, which contains the remaining \(n-n_{cmb}-n_{val}\) observations. The three sets must be pairwise disjoint within each cross validation loop.
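The partition described above can be illustrated with a minimal sketch in base R. This is not the package's implementation of random.CVind: the helper name `sketch_cv_indices` and the list element names `train`, `val` and `test` are assumptions made for illustration, and the actual return structure may differ. The sketch draws one random permutation of the row indices per cross validation loop and cuts it into three disjoint blocks.

```r
# Hypothetical helper for illustration only; not the package's random.CVind.
sketch_cv_indices <- function(n, ncmb, nval, CV) {
  # At least one observation must remain for the test set.
  stopifnot(ncmb >= 1, nval >= 1, CV >= 1, ncmb + nval < n)
  lapply(seq_len(CV), function(i) {
    perm <- sample.int(n)  # random permutation of 1..n
    list(
      train = perm[seq_len(ncmb)],         # first ncmb indices
      val   = perm[ncmb + seq_len(nval)],  # next nval indices
      test  = perm[(ncmb + nval + 1):n]    # remaining n - ncmb - nval indices
    )
  })
}

set.seed(1)
folds <- sketch_cv_indices(n = 20, ncmb = 10, nval = 5, CV = 3)
lengths(folds[[1]])  # sizes of the three index sets in the first loop
```

With these inputs, each loop yields 10 training, 5 validation and 5 test indices. Cutting a single permutation guarantees that the three sets are disjoint and together cover all \(n\) rows.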