mvrCv: Cross-validation

Description

Performs the cross-validation calculations for mvr.

Usage

mvrCv(X, Y, ncomp,
      method = c("kernelpls", "simpls", "oscorespls", "svdpc"), scale = FALSE,
      segments = 10, segment.type = c("random", "consecutive", "interleaved"),
      length.seg, trace = FALSE, ...)

Arguments

X

a matrix of observations. NAs and Infs are not allowed.

Y

a vector or matrix of responses. NAs and Infs are not allowed.

ncomp

the number of components to be used in the modelling.

method

the multivariate regression method to be used.

scale

logical. If TRUE, the learning $X$ data for each segment is scaled by dividing each variable by its sample standard deviation. The prediction data is scaled by the same amount.

segments

the number of segments to use, or a list with segments (see below).

segment.type

the type of segments to use. Ignored if segments is a list.

length.seg

Positive integer. The length of the segments to use. If specified, it overrides segments unless segments is a list.

trace

logical; if TRUE, the segment number is printed for each segment.

...

additional arguments, sent to the underlying fit function.
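The behaviour described for scale can be illustrated with a small base R sketch. It is not the package's internal code: the data are simulated, the held-out segment seg is arbitrary, and all variable names are made up for illustration. The point is only that the standard deviations come from the learning data and are then reused for the prediction data.

```r
# Sketch only: simulated data, arbitrary held-out segment
set.seed(1)
X <- matrix(rnorm(40), nrow = 10, ncol = 4)
seg <- c(2, 5, 9)                        # indices of the held-out segment

X_train <- X[-seg, , drop = FALSE]       # learning data for this segment
X_test  <- X[seg, , drop = FALSE]        # prediction data for this segment

# Standard deviations are estimated from the learning data only
sds <- apply(X_train, 2, sd)

# Both parts are divided by the same learning-data standard deviations
X_train_scaled <- sweep(X_train, 2, sds, "/")
X_test_scaled  <- sweep(X_test, 2, sds, "/")
```

After this step each column of X_train_scaled has unit standard deviation, while the columns of X_test_scaled generally do not, because their scale is taken from the learning data.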

Value

A list with the following components:
method

equals "CV" for cross-validation.

pred

an array with the cross-validated predictions.

MSEP0

a vector of MSEP values (one for each response variable) for a model with zero components, i.e., only the intercept.

MSEP

a matrix of MSEP values for models with 1, ..., ncomp components. Each row corresponds to one response variable.

adj

a matrix of adjustment values for calculating bias-corrected MSEP. MSEP uses this.

R2

a matrix of R2 values for models with 1, ..., ncomp components. Each row corresponds to one response variable.

segments

the list of segments used in the cross-validation.

ncomp

the actual number of components used.

Details

This function is not meant to be called directly, but through the generic functions pcr, plsr or mvr with the argument validation set to "CV" or "LOO". All arguments to mvrCv can be specified in the generic function call.

If segments is a list, the arguments segment.type and length.seg are ignored. The elements of the list should be integer vectors specifying the indices of the segments. See cvsegments for details.

Otherwise, segments of type segment.type are generated. How many segments to generate is selected by specifying the number of segments in segments, or giving the segment length in length.seg. If both are specified, segments is ignored.
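To make the shape of a segment list concrete, here is a minimal base R sketch that builds consecutive and interleaved segments by hand. The helpers make_consecutive and make_interleaved are hypothetical and not part of the package; in practice cvsegments generates these lists, and its exact splitting may differ from this sketch.

```r
# Hypothetical helpers (not part of pls): each returns a list of integer
# vectors, one vector of observation indices per segment.
make_consecutive <- function(n, k) {
  # Cut 1..n into k contiguous blocks of nearly equal size
  unname(split(seq_len(n), ceiling(seq_len(n) * k / n)))
}

make_interleaved <- function(n, k) {
  # Deal 1..n out to the k segments in turn: 1, 2, ..., k, 1, 2, ...
  unname(split(seq_len(n), rep(seq_len(k), length.out = n)))
}

cons  <- make_consecutive(10, 3)
inter <- make_interleaved(10, 3)
```

A list of this form is what the segments argument accepts when it is not a single number.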

X and Y do not need to be centered.

The R2 component returned is calculated as the squared correlation between the cross-validated predictions and the responses.

Note that this function cannot be used in situations where $X$ needs to be recalculated for each segment (except for scaling by the standard deviation), for instance with msc or other preprocessing. For such models, use the more general (but slower) function crossval.
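The following sketch shows how cross-validated predictions can be assembled segment by segment and how an R2 of the kind described above follows from them. It is an illustration under stated assumptions, not the package's implementation: lm() stands in for the PCR/PLSR fit, the built-in mtcars data replace a real spectroscopic data set, and the four interleaved segments are chosen arbitrarily.

```r
# Sketch only: ordinary least squares replaces the PCR/PLSR fit
y <- mtcars$mpg
X <- as.matrix(mtcars[, c("wt", "hp")])
n <- nrow(X)

# Four interleaved segments, built by hand for illustration
segments <- split(seq_len(n), rep(1:4, length.out = n))

pred <- numeric(n)
for (seg in segments) {
  # Fit on all observations outside the current segment
  train <- data.frame(y = y[-seg], X[-seg, , drop = FALSE])
  fit <- lm(y ~ ., data = train)
  # Predict the held-out observations
  pred[seg] <- predict(fit, newdata = data.frame(X[seg, , drop = FALSE]))
}

msep <- mean((y - pred)^2)   # cross-validated mean squared error of prediction
r2   <- cor(pred, y)^2       # squared correlation of predictions and responses
```

Note that X is used as given inside the loop. Preprocessing that depends on the learning data would have to be redone for every segment, which is the case the text refers to crossval for.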

Also note that if needed, the function will silently(!) reduce ncomp to the maximal number of components that can be cross-validated, which is $n - l - 1$, where $n$ is the number of observations and $l$ is the length of the longest segment. The (possibly reduced) number of components is returned as the component ncomp.
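As a worked example of this limit, the base R sketch below computes $n - l - 1$ for made-up numbers. The variable names and the segment layout are illustrative assumptions only.

```r
# Illustrative numbers only
n <- 20                                              # number of observations
segments <- split(seq_len(n), rep(1:3, length.out = n))  # segment sizes 7, 7, 6
l <- max(sapply(segments, length))                   # length of the longest segment

ncomp_requested <- 15
ncomp_max  <- n - l - 1                              # 20 - 7 - 1 = 12
ncomp_used <- min(ncomp_requested, ncomp_max)        # reduced from 15 to 12
```

Because the reduction happens without a warning, comparing the requested value with the returned ncomp component is a simple way to detect it.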

References

Mevik, B.-H., Cederkvist, H. R. (2004) Mean Squared Error of Prediction (MSEP) Estimates for Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR). Journal of Chemometrics, 18(9), 422-429.

Examples

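library(pls)
data(NIR)
# Fit a 6-component PCR with 10-segment cross-validation
NIR.pcr <- pcr(y ~ X, 6, data = NIR, validation = "CV", segments = 10)
# Plot the cross-validated MSEP estimates
plot(MSEP(NIR.pcr))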
