mixOmics (version 2.9-2)

valid: Compute validation criterion for PLS and sparse PLS

Description

Function to estimate the mean squared error of prediction (MSEP), root mean squared error of prediction (RMSEP) and $R^2$ for fitted PLS and sPLS regression models. M-fold and leave-one-out cross-validation are implemented.
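For orientation, the criteria can be written out for a single response variable $j$. This is a sketch using the standard cross-validation definitions, not a formula taken from the package source. The symbols are assumptions introduced here: $n$ is the number of samples, $y_{ij}$ the observed value, and $\hat{y}_{ij}^{(-i)}$ the prediction for sample $i$ from a model fitted without the fold containing sample $i$.

$$
\mathrm{MSEP}_j = \frac{1}{n} \sum_{i=1}^{n} \left( y_{ij} - \hat{y}_{ij}^{(-i)} \right)^2,
\qquad
\mathrm{RMSEP}_j = \sqrt{\mathrm{MSEP}_j}
$$

This page does not state which definition of $R^2$ is used, so none is given here.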

Usage

valid(X, Y, ncomp = min(6, ncol(X)), 
      mode = c("regression", "invariant", "classic"),
      method = c("pls", "spls"),
      keepX = if(method == "pls") NULL else c(rep(ncol(X), ncomp)),
      keepY = if(method == "pls") NULL else c(rep(ncol(Y), ncomp)),
      validation = c("loo", "Mfold"),
      M = if(validation == "Mfold") 10 else nrow(X),
      max.iter = 500, 
      tol = 1e-06,
      na.action = c("omit", "predict"),
      predict.par = NULL)
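
Several defaults in the usage above are vectors of choices, and the default for M depends on validation. A minimal sketch of how such defaults usually resolve in R, assuming the function uses the standard match.arg idiom (the page does not confirm this). The function resolve_choices and its argument n are hypothetical stand-ins, not part of mixOmics.

```r
# Hypothetical stand-in for valid(): only the argument handling is shown.
resolve_choices <- function(mode = c("regression", "invariant", "classic"),
                            validation = c("loo", "Mfold"),
                            n = 20,  # stands in for nrow(X)
                            M = if (validation == "Mfold") 10 else n) {
  # match.arg() picks the first choice when the argument is not supplied
  mode <- match.arg(mode)
  validation <- match.arg(validation)
  # M is evaluated lazily, so it sees the resolved value of validation
  list(mode = mode, validation = validation, M = M)
}

str(resolve_choices())                       # first choices, M equal to n
str(resolve_choices(validation = "Mfold"))   # M falls back to 10
```

Under this assumption, omitting mode and validation selects "regression" and "loo", and M then equals the number of samples.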

Arguments

X
numeric matrix of predictors. NAs are allowed.
Y
numeric vector or matrix of responses (for multi-response models). NAs are allowed.
ncomp
the number of components to include in the model. Models with 1 up to ncomp components are evaluated; the default is min(6, ncol(X)).
mode
character string. What type of algorithm to use, matching one of "classic", "invariant" or "regression".
method
character string. Which method to use, one of "pls" or "spls".
keepX
if method = "spls", a numeric vector of length ncomp giving the number of variables to keep in the $X$-loadings for each component. By default all variables are kept in the model.
keepY
if method = "spls", a numeric vector of length ncomp giving the number of variables to keep in the $Y$-loadings for each component. By default all variables are kept in the model.
validation
character. What kind of (internal) validation to use. See below.
M
the number of folds in the Mfold cross-validation.
max.iter
integer, the maximum number of iterations.
tol
a non-negative real, the tolerance used in the iterative algorithm.
na.action
action determining what should be done with missing values in X. One of "predict" or "omit" (see Details).
predict.par
further arguments sent to nipals function.

Value

valid produces a list with the following components:

  • msep: Mean Square Error of Prediction for each Y variable.
  • rmsep: Root Mean Square Error of Prediction for each Y variable.
  • r2: a matrix of $R^2$ values of the $Y$-variables for models with 1, ..., ncomp components.

Details

If na.action = "predict", missing values are estimated by reconstituting the data matrix with the nipals function. Otherwise, missing values are handled by deleting incomplete cases.

The validation criteria "MSEP", "RMSEP" and "R2" assess the predictive validity of the model using M-fold or leave-one-out cross-validation. Note that only the classic, regression and invariant modes can be applied.

If validation = "Mfold", M-fold cross-validation is performed, with the number of folds given by M. If validation = "loo", leave-one-out cross-validation is performed.
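
The cross-validation loop described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: ordinary least squares via lm() stands in for the PLS fit, a single response is used, incomplete cases are dropped as with na.action = "omit", and $R^2$ is taken as the squared correlation between observed and cross-validated predictions, which is only one possible definition. The function name cv_criteria is hypothetical.

```r
# Sketch only: lm() is a stand-in model, NOT PLS or sparse PLS.
cv_criteria <- function(X, y, validation = c("loo", "Mfold"), M = 10) {
  validation <- match.arg(validation)

  # Delete incomplete cases, mirroring na.action = "omit"
  keep <- complete.cases(X, y)
  X <- X[keep, , drop = FALSE]
  y <- y[keep]
  n <- nrow(X)

  # Leave-one-out is M-fold with one observation per fold
  if (validation == "loo") M <- n

  # Randomly assign each observation to one of M folds
  fold <- sample(rep(seq_len(M), length.out = n))
  dat <- data.frame(y = y, X)
  y_hat <- numeric(n)

  for (m in seq_len(M)) {
    test <- fold == m
    # Fit on the other folds, predict the held-out fold
    fit <- lm(y ~ ., data = dat[!test, , drop = FALSE])
    y_hat[test] <- predict(fit, newdata = dat[test, , drop = FALSE])
  }

  msep <- mean((y - y_hat)^2)
  # r2 as squared correlation is an assumption, see the text above
  list(msep = msep, rmsep = sqrt(msep), r2 = cor(y, y_hat)^2)
}

# Simulated data, for illustration only
set.seed(1)
X <- matrix(rnorm(60), nrow = 20, ncol = 3)
y <- drop(X %*% c(1, -2, 0.5)) + rnorm(20, sd = 0.1)

loo_res <- cv_criteria(X, y, validation = "loo")
mfold_res <- cv_criteria(X, y, validation = "Mfold", M = 5)
```

With validation = "loo" the fold assignment is a permutation of the samples, so each model is fitted on all samples but one; with "Mfold" each model leaves out roughly n / M samples at a time.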

References

Tenenhaus, M. (1998). La régression PLS: théorie et pratique. Paris: Editions Technip.

Lê Cao, K.-A., Rossouw, D., Robert-Granié, C. and Besse, P. (2008). A sparse PLS for variable selection when integrating Omics data. Statistical Applications in Genetics and Molecular Biology 7, article 35.

Mevik, B.-H. and Cederkvist, H. R. (2004). Mean Squared Error of Prediction (MSEP) Estimates for Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR). Journal of Chemometrics 18(9), 422-429.

See Also

predict, nipals, plot.valid.

Examples

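library(mixOmics)

# Liver toxicity data: gene expression as predictors, clinical measurements as responses
data(liver.toxicity)
X <- liver.toxicity$gene
Y <- liver.toxicity$clinic

# Leave-one-out cross-validation of a 5-component PLS regression model
liver.val <- valid(X, Y, ncomp = 5, mode = "regression", 
                   method = "pls", validation = "loo")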