The functions fit PLSR or PCR models with 1, $\ldots$, ncomp
components. Multi-response models are fully supported. The formula
argument should be a symbolic formula of the form response ~ terms,
where response is the name of the response vector or matrix (for
multi-response models) and terms is the name of one or more predictor
matrices, usually separated by +, e.g., water ~ FTIR or y ~ X + Z.
See lm for a detailed description. The named variables should exist in
the data frame supplied as the data argument or in the global
environment. Note: Do not use mvr(mydata$y ~ mydata$X, ...); instead
use mvr(y ~ X, data = mydata, ...). Otherwise, predict.mvr will not
work properly.
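
The recommended calling pattern can be illustrated with a minimal sketch. The toy data, the names mydata and newdata, and the use of I() to store a matrix as a single data frame column are assumptions made for illustration only; they are not taken from the text above.

```r
library(pls)

# Hypothetical toy data: 20 samples, 5 predictors, one response.
set.seed(1)
X <- matrix(rnorm(20 * 5), nrow = 20)
y <- as.numeric(X %*% c(1, -1, 0.5, 0, 0) + rnorm(20, sd = 0.1))

# Keep the predictor matrix as one column of the data frame.
mydata <- data.frame(y = y, X = I(X))

# Refer to the variables by name and pass the data frame via data =.
fit <- mvr(y ~ X, ncomp = 3, data = mydata)

# New data only needs a column with the same name as the predictor term.
newdata <- data.frame(X = I(matrix(rnorm(4 * 5), nrow = 4)))
pred <- predict(fit, newdata = newdata, ncomp = 1:3)
dim(pred)  # observations x responses x models
```

Because the formula refers to X rather than mydata$X, predict can look up the predictor matrix by name in newdata.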
The chapter "Statistical models in R" of the manual "An Introduction
to R", distributed with R, is a good reference on formulas in R.
Three PLSR algorithms are available: the kernel algorithm, SIMPLS and
the classical orthogonal scores algorithm. One PCR algorithm is
available, based on the singular value decomposition. The type of
regression is specified with the method argument. pcr and plsr are
wrappers for mvr, with different values for method.
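
As a rough sketch of how the wrappers relate to mvr, the example below uses the yarn data set shipped with the pls package. The method strings "svdpc" and "simpls" are the values as I understand them from the package and are not stated in the text above, so treat them as assumptions.

```r
library(pls)
data(yarn)

# pcr() should be equivalent to calling mvr() with the SVD-based PCR method.
fit_pcr <- pcr(density ~ NIR, ncomp = 4, data = yarn)
fit_mvr <- mvr(density ~ NIR, ncomp = 4, data = yarn, method = "svdpc")
same_pcr <- isTRUE(all.equal(coef(fit_pcr), coef(fit_mvr)))

# plsr() accepts a method argument to choose among the PLSR algorithms.
fit_simpls <- plsr(density ~ NIR, ncomp = 4, data = yarn, method = "simpls")
fit_simpls$method
```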
If validation = "CV", cross-validation is performed. The number and
type of cross-validation segments are specified with the arguments
segments and segment.type. See mvrCv for details. If
validation = "LOO", leave-one-out cross-validation is performed. It is
an error to specify segments when validation = "LOO".
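
The two validation modes might be requested as follows. This is a sketch on the yarn data from the pls package; the choice of 7 interleaved segments and 5 components is arbitrary, and reading the segments back from the validation component of the fit is my assumption about the returned object.

```r
library(pls)
data(yarn)

# 7-fold cross-validation with interleaved segments.
fit_cv <- plsr(density ~ NIR, ncomp = 5, data = yarn,
               validation = "CV", segments = 7,
               segment.type = "interleaved")

# Leave-one-out cross-validation; segments must not be given here.
fit_loo <- plsr(density ~ NIR, ncomp = 5, data = yarn,
                validation = "LOO")

# Number of segments actually used in each case.
n_cv  <- length(fit_cv$validation$segments)
n_loo <- length(fit_loo$validation$segments)

# Cross-validated root mean squared error of prediction.
RMSEP(fit_cv)
```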
Note that the cross-validation is optimised for speed, and some
generality has been sacrificed. In particular, the model matrix is
calculated only once for the complete cross-validation, so models like
y ~ msc(X) will not be properly cross-validated. However, scaling
requested by scale = TRUE is properly cross-validated. For proper
cross-validation of models where the model matrix must be
updated/regenerated for each segment, use the separate function
crossval.
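
A minimal sketch of the crossval route for a model with msc preprocessing is shown below, again on the yarn data from the pls package. The number of segments and components is arbitrary, and the sketch assumes that crossval takes a model fitted without validation and returns it with a validation component added.

```r
library(pls)
data(yarn)

# Fit without built-in validation; msc() is part of the model formula.
fit <- plsr(density ~ msc(NIR), ncomp = 5, data = yarn)

# Cross-validate with the model matrix regenerated for each segment,
# so the msc() preprocessing is redone on every training set.
set.seed(42)  # segments are drawn at random by default
fit_cv <- crossval(fit, segments = 7)

RMSEP(fit_cv)
```

This is slower than validation = "CV" because the model matrix is rebuilt for every segment.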