pcv (version 1.1.0)

pcvcrossval: Generate sequence of indices for cross-validation

Description

Generates and returns a sequence of object indices for each segment in random segmented cross-validation.

Usage

pcvcrossval(cv = 1, nobj = NULL, resp = NULL)

Value

vector with object indices for each segment

Arguments

cv

cross-validation settings: a number, a list, or a vector of integers (see Details).

nobj

number of objects in a dataset

resp

vector or matrix with response values, used for Venetian blinds splits

Details

Parameter `cv` defines how to split the rows of the training set. The split is similar to cross-validation splits, as PCV is based on cross-validation. This parameter can have the following values:

1. A number (e.g. `cv = 4`). The number specifies the number of segments for random splits, except for `cv = 1`, which is a special case meaning leave-one-out (full cross-validation).

2. A list with 2 values: `list("name", nseg)`. Here `"name"` defines how the split is made and can be one of the following: `"loo"` for leave-one-out, `"rand"` for random splits or `"ven"` for Venetian blinds (systematic) splits. The second value, `nseg`, is the number of segments to split the rows into. For example, `cv = list("ven", 4)` tells PCV to use Venetian blinds splits with 4 segments.

3. A vector of integers, e.g. `cv = c(1, 2, 3, 1, 2, 3, 1, 2, 3)`. The number of values in this vector must equal the number of rows in the training set. Each value specifies which segment the corresponding row belongs to. The example shown here assumes a training set with 9 rows, which is split into 3 segments. The first segment consists of the measurements from rows 1, 4 and 7.
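The three forms of `cv` described above can be illustrated with a small base R sketch. The helper `sketch_segments()` is hypothetical and is not part of the pcv package. It assumes that a systematic split assigns rows to segments cyclically (1, 2, ..., `nseg`, 1, 2, ...) and that a random split is a shuffled version of the same assignment. It ignores `resp`, so it is not a reproduction of how `pcvcrossval()` works internally.

```r
# Hypothetical helper, not part of the pcv package: turns a `cv` setting
# into a vector of segment numbers, one value per row.
sketch_segments <- function(cv, nobj) {
  # Form 3: a vector with one segment number per row is used as is.
  if (is.numeric(cv) && length(cv) > 1) {
    stopifnot(length(cv) == nobj)
    return(as.integer(cv))
  }

  # Form 1: a single number means leave-one-out (1) or random splits (> 1).
  if (is.numeric(cv)) {
    cv <- if (cv == 1) list("loo", nobj) else list("rand", cv)
  }

  # Form 2: list("name", nseg); leave-one-out uses one segment per row.
  type <- cv[[1]]
  nseg <- if (type == "loo") nobj else cv[[2]]

  # Cyclic assignment 1, 2, ..., nseg, 1, 2, ... (assumed systematic pattern).
  seg <- rep(seq_len(nseg), length.out = nobj)

  # Random splits: same segment sizes, rows shuffled (assumption).
  if (type == "rand") seg <- seg[sample.int(nobj)]

  seg
}

sketch_segments(4, nobj = 9)               # random split into 4 segments
sketch_segments(list("ven", 4), nobj = 9)  # 1 2 3 4 1 2 3 4 1

seg <- sketch_segments(c(1, 2, 3, 1, 2, 3, 1, 2, 3), nobj = 9)
which(seg == 1)                            # 1 4 7
```

The last call reproduces the example from item 3: the rows assigned to the first segment are 1, 4 and 7.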