mdatools (version 0.5.3)

pls: Partial Least Squares regression

Description

pls is used to calibrate, validate and apply a partial least squares (PLS) regression model.

Usage

pls(x, y, ncomp = 15, center = T, scale = F, cv = NULL,
    x.test = NULL, y.test = NULL, method = 'simpls', alpha = 0.05, 
    coeffs.ci = NULL, coeffs.alpha = 0.1, info = '')

Arguments

x
matrix with predictors.
y
matrix with responses.
ncomp
maximum number of components to calculate.
center
logical, whether to center the predictors and response values.
scale
logical, whether to scale (standardize) the predictors and response values.
cv
number of segments for cross-validation (if cv = 1, full cross-validation will be used).
x.test
matrix with predictors for test set.
y.test
matrix with responses for test set.
method
method for calculating the PLS model (so far only 'simpls' is available).
alpha
significance level for calculating statistical limits for residuals.
coeffs.ci
method to calculate p-values and confidence intervals for regression coefficients (so far only jack-knifing is available: 'jk').
coeffs.alpha
significance level for calculating confidence intervals for regression coefficients.
info
short text with information about the model.
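
The cv argument can be pictured as a split of the calibration objects into segments, each of which is held out once. The base R sketch below only illustrates that idea: make_segments is a hypothetical helper, and the random assignment of objects to segments is an assumption, not necessarily what pls does internally.

```r
# Hypothetical helper (not part of mdatools): split object indices into
# cross-validation segments. cv = 1 is treated as full (leave-one-out)
# cross-validation, as described for the cv argument.
make_segments <- function(nobj, cv) {
  nseg <- if (cv == 1) nobj else cv
  # Assumption: objects are assigned to segments at random
  unname(split(sample(nobj), rep_len(seq_len(nseg), nobj)))
}

set.seed(42)
segments <- make_segments(nobj = 20, cv = 4)
lengths(segments)                          # size of each of the four segments
length(make_segments(nobj = 20, cv = 1))   # one segment per object
```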

Value

Returns an object of pls class with the following fields:

  • ncomp: number of components included in the model.
  • ncomp.selected: selected (optimal) number of components.
  • xloadings: matrix with loading values for x decomposition.
  • yloadings: matrix with loading values for y decomposition.
  • weights: matrix with PLS weights.
  • selratio: array with selectivity ratio values.
  • vipscores: matrix with VIP scores values.
  • coeffs: object of class regcoeffs with regression coefficients calculated for each component.
  • info: information about the model, provided by the user when building the model.
  • calres: an object of class plsres with PLS results for the calibration data.
  • testres: an object of class plsres with PLS results for the test data, if it was provided.
  • cvres: an object of class plsres with PLS results for cross-validation, if this option was chosen.

Details

So far only the SIMPLS method [1] is available; more are planned. The implementation works with both single and multiple response variables.
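
For orientation, the SIMPLS steps from [1] can be sketched in base R as follows. This is a minimal illustration written for this page, not the pls.simpls code from the package: simpls_sketch and all variable names are hypothetical, only mean-centering is applied, and only the regression coefficients for the full set of requested components are returned.

```r
# Hypothetical sketch of SIMPLS (de Jong, 1993); not the package implementation.
simpls_sketch <- function(X, Y, ncomp) {
  X <- scale(X, center = TRUE, scale = FALSE)
  Y <- scale(as.matrix(Y), center = TRUE, scale = FALSE)
  p <- ncol(X)
  R <- V <- matrix(0, p, ncomp)        # weights and orthonormal loading basis
  Q <- matrix(0, ncol(Y), ncomp)       # y-loadings
  S <- crossprod(X, Y)                 # cross-product matrix X'Y
  for (a in seq_len(ncomp)) {
    r <- svd(S, nu = 1, nv = 0)$u      # dominant left singular vector of S
    tsc <- X %*% r                     # scores
    nt <- sqrt(sum(tsc^2))
    tsc <- tsc / nt                    # normalise scores to unit length
    r <- r / nt                        # rescale weights accordingly
    pl <- crossprod(X, tsc)            # x-loadings
    q <- crossprod(Y, tsc)             # y-loadings
    v <- pl
    if (a > 1) {
      Vprev <- V[, seq_len(a - 1), drop = FALSE]
      v <- v - Vprev %*% crossprod(Vprev, pl)  # orthogonalise to earlier loadings
    }
    v <- v / sqrt(sum(v^2))
    S <- S - v %*% crossprod(v, S)     # deflate the cross-product matrix
    R[, a] <- r
    V[, a] <- v
    Q[, a] <- q
  }
  R %*% t(Q)                           # coefficients for the centered data
}

set.seed(1)
X <- matrix(rnorm(40 * 5), nrow = 40)
y <- X %*% c(1, -2, 0, 0.5, 0) + rnorm(40, sd = 0.1)
B2 <- simpls_sketch(X, y, ncomp = 2)   # model with two components
B5 <- simpls_sketch(X, y, ncomp = 5)   # all five components
```

With as many components as there are (linearly independent) predictors, the coefficients coincide with ordinary least squares, which is a convenient sanity check for a sketch like this.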

As in pca, the number of components (ncomp) actually used by pls is the minimum of three values: the number of objects minus 1, the number of x variables, and the default or user-provided value. Regression coefficients, predictions and other results are calculated for each set of components from 1 to ncomp: 1, 1:2, 1:3, etc. There is also a function, selectCompNum.pls, for selecting the optimal number of components in a model (ncomp.selected). The selected number of components is then used for all default operations such as predictions and plots.
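
The rule for the number of components can be written out in a few lines of base R. The helper name effective_ncomp is hypothetical and only mirrors the description above; it is not a function of the package.

```r
# Hypothetical helper (not part of mdatools) that mirrors the rule above.
effective_ncomp <- function(x, ncomp = 15) {
  min(nrow(x) - 1, ncol(x), ncomp)
}

x <- matrix(0, nrow = 10, ncol = 200)      # 10 objects, 200 variables
A <- effective_ncomp(x)                    # limited by number of objects - 1
A_small <- effective_ncomp(x, ncomp = 4)   # limited by the requested value

# Component sets for which results are calculated: 1, 1:2, ..., 1:A_small
component_sets <- lapply(seq_len(A_small), seq_len)
```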

Selectivity ratio [2] and VIP scores [3] are calculated automatically for any PLS model. Selectivity ratio values are calculated for all computed components, whereas VIP scores are computed only for the selected components (to save calculation time) and are recalculated every time selectCompNum() is called for the model.
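
To make the VIP calculation concrete, here is a base R sketch of the usual formulation from [3] for a single response. It is an illustration under assumptions: the weights, scores and y-loadings come from a small NIPALS PLS1 routine swapped in for brevity (not the SIMPLS code used by pls), and pls1_nipals and vip_scores are hypothetical names. The exact computation inside the package may differ.

```r
# Minimal NIPALS PLS1 (single response), used only to obtain weights,
# scores and y-loadings. This is NOT the SIMPLS routine used by pls().
pls1_nipals <- function(X, y, ncomp) {
  X <- scale(X, center = TRUE, scale = FALSE)
  y <- as.numeric(y) - mean(y)
  W <- matrix(0, ncol(X), ncomp)
  Tsc <- matrix(0, nrow(X), ncomp)
  q <- numeric(ncomp)
  for (a in seq_len(ncomp)) {
    w <- drop(crossprod(X, y))
    w <- w / sqrt(sum(w^2))              # unit-length weight vector
    tsc <- drop(X %*% w)                 # scores
    tt <- sum(tsc^2)
    pl <- drop(crossprod(X, tsc)) / tt   # x-loadings
    q[a] <- sum(y * tsc) / tt            # y-loading
    X <- X - outer(tsc, pl)              # deflate X
    y <- y - tsc * q[a]                  # deflate y
    W[, a] <- w
    Tsc[, a] <- tsc
  }
  list(weights = W, scores = Tsc, yloadings = q)
}

# VIP_j = sqrt(p * sum_a(SSY_a * (w_ja / ||w_a||)^2) / sum_a(SSY_a)),
# where SSY_a is the y-variance explained by component a.
vip_scores <- function(fit) {
  ssy <- fit$yloadings^2 * colSums(fit$scores^2)
  w2 <- sweep(fit$weights^2, 2, colSums(fit$weights^2), "/")
  drop(sqrt(nrow(fit$weights) * (w2 %*% ssy) / sum(ssy)))
}

set.seed(1)
X <- matrix(rnorm(30 * 6), nrow = 30)
y <- X[, 1] - 2 * X[, 3] + rnorm(30, sd = 0.1)
fit <- pls1_nipals(X, y, ncomp = 3)
vip <- vip_scores(fit)
mean(vip^2)   # equals 1 up to rounding, by construction of the formula
```

Because the squared VIP scores average to 1, a threshold of 1 is commonly used to flag influential variables.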

So far, confidence intervals and p-values for regression coefficients can be calculated only by jack-knifing. See the help for regcoeffs objects for details.
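
The general idea of jack-knifing can be sketched in base R as below. This is a generic leave-one-out jack-knife, with ordinary least squares swapped in for PLS to keep the example short. The helper jackknife_ci is hypothetical, and the t-distribution with n - 1 degrees of freedom is an assumption. The computation actually used by the package is described in the regcoeffs help and may differ.

```r
# Hypothetical helper: leave-one-out jack-knife for regression coefficients.
# Ordinary least squares (lm.fit) stands in for the PLS fit here.
jackknife_ci <- function(X, y, alpha = 0.1) {
  n <- nrow(X)
  fit_coef <- function(rows) {
    lm.fit(cbind(1, X[rows, , drop = FALSE]), y[rows])$coefficients
  }
  b_full <- fit_coef(seq_len(n))                          # all objects
  b_loo <- sapply(seq_len(n), function(i) fit_coef(-i))   # one column per left-out object
  b_mean <- rowMeans(b_loo)
  se <- sqrt((n - 1) / n * rowSums((b_loo - b_mean)^2))   # jack-knife standard error
  tcrit <- qt(1 - alpha / 2, df = n - 1)                  # assumption: t with n - 1 df
  cbind(estimate = b_full,
        lower = b_full - tcrit * se,
        upper = b_full + tcrit * se,
        p.value = 2 * pt(-abs(b_full / se), df = n - 1))
}

set.seed(1)
X <- matrix(rnorm(25 * 2), nrow = 25)
y <- 1 + 2 * X[, 1] + rnorm(25, sd = 0.2)
ci <- jackknife_ci(X, y, alpha = 0.1)   # 90% intervals, cf. coeffs.alpha = 0.1
```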

References

1. S. de Jong, Chemometrics and Intelligent Laboratory Systems 18 (1993) 251-263.

2. T. Rajalahti et al., Chemometrics and Intelligent Laboratory Systems 95 (2009) 35-48.

3. I.-G. Chong, C.-H. Jun, Chemometrics and Intelligent Laboratory Systems 78 (2005) 103-112.

See Also

Methods for pls objects:

  • print: prints information about a pls object.
  • summary.pls: shows performance statistics for the model.
  • plot.pls: shows plot overview of the model.
  • pls.simpls: implementation of the SIMPLS algorithm.
  • predict.pls: applies the PLS model to new data.
  • selectCompNum.pls: sets the optimal number of components in the model.
  • plotPredictions.pls: shows predicted vs. measured plot.
  • plotRegcoeffs.pls: shows regression coefficients plot.
  • plotXScores.pls: shows scores plot for x decomposition.
  • plotXYScores.pls: shows scores plot for x and y decomposition.
  • plotXLoadings.pls: shows loadings plot for x decomposition.
  • plotXYLoadings.pls: shows loadings plot for x and y decomposition.
  • plotRMSE.pls: shows RMSE plot.
  • plotXVariance.pls: shows explained variance plot for x decomposition.
  • plotYVariance.pls: shows explained variance plot for y decomposition.
  • plotXCumVariance.pls: shows cumulative explained variance plot for x decomposition.
  • plotYCumVariance.pls: shows cumulative explained variance plot for y decomposition.
  • plotXResiduals.pls: shows T2 vs. Q2 plot for x decomposition.
  • plotYResiduals.pls: shows residuals plot for y values.
  • plotSelectivityRatio.pls: shows plot with selectivity ratio values.
  • plotVIPScores.pls: shows plot with VIP scores values.
  • getSelectivityRatio.pls: returns vector with selectivity ratio values.
  • getVIPScores.pls: returns vector with VIP scores values.

Most of the plotting methods (except those for loadings and regression coefficients) are also available for PLS results (plsres) objects.

Examples

### Examples of using PLS model class

## 1. Make a PLS model for concentration of first component 
## using full-cross validation and show overview

data(simdata)
x = simdata$spectra.c
y = simdata$conc.c[, 1]

model = pls(x, y, ncomp = 8, cv = 1)
model = selectCompNum(model, 2)
summary(model)
plot(model)

## 2. Make a PLS model for concentration of first component 
## using test set and 10 segment cross-validation and show overview

data(simdata)
x = simdata$spectra.c
y = simdata$conc.c[, 1]
x.t = simdata$spectra.t
y.t = simdata$conc.t[, 1]

model = pls(x, y, ncomp = 8, cv = 10, x.test = x.t, y.test = y.t)
model = selectCompNum(model, 2)
summary(model)
plot(model)

## 3. Make a PLS model for concentration of first component 
## using only test set validation and show overview

data(simdata)
x = simdata$spectra.c
y = simdata$conc.c[, 1]
x.t = simdata$spectra.t
y.t = simdata$conc.t[, 1]

model = pls(x, y, ncomp = 6, x.test = x.t, y.test = y.t)
model = selectCompNum(model, 2)
summary(model)
plot(model)

## 4. Show variance and error plots for a PLS model
par(mfrow = c(2, 2))
plotXCumVariance(model, type = 'h')
plotYCumVariance(model, type = 'b', show.labels = TRUE, legend.position = 'bottomright')
plotRMSE(model)
plotRMSE(model, type = 'h', show.labels = TRUE)
par(mfrow = c(1, 1))

## 5. Show scores plots for a PLS model
par(mfrow = c(2, 2))
plotXScores(model)
plotXScores(model, comp = c(1, 3), show.labels = TRUE)
plotXYScores(model)
plotXYScores(model, comp = 2, show.labels = TRUE)
par(mfrow = c(1, 1))

## 6. Show loadings and coefficients plots for a PLS model
par(mfrow = c(2, 2))
plotXLoadings(model)
plotXLoadings(model, comp = c(1, 2), type = 'l')
plotXYLoadings(model, comp = c(1, 2), legend.position = 'topleft')
plotRegcoeffs(model)
par(mfrow = c(1, 1))

## 7. Show predictions and residuals plots for a PLS model
par(mfrow = c(2, 2))
plotXResiduals(model, show.labels = TRUE)
plotYResiduals(model, show.labels = TRUE)
plotPredictions(model)
plotPredictions(model, ncomp = 4, xlab = 'C, reference', ylab = 'C, predictions')
par(mfrow = c(1, 1))

## 8. Selectivity ratio and VIP scores plots
par(mfrow = c(2, 2))
plotSelectivityRatio(model)
plotSelectivityRatio(model, ncomp = 1)
plotVIPScores(model)
par(mfrow = c(1, 1))

## 9. Variable selection with selectivity ratio
selratio = getSelectivityRatio(model)
selvar = !(selratio < 8)

xsel = x[, selvar]
modelsel = pls(xsel, y, ncomp = 6, cv = 1)
modelsel = selectCompNum(modelsel, 3)

summary(model)
summary(modelsel)

## 10. Calculate average spectrum and show the selected variables
i = 1:ncol(x)
ms = apply(x, 2, mean)

par(mfrow = c(2, 2))

plot(i, ms, type = 'p', pch = 16, col = 'red', main = 'Original variables')
plotPredictions(model)

plot(i, ms, type = 'p', pch = 16, col = 'lightgray', main = 'Selected variables')
points(i[selvar], ms[selvar], col = 'red', pch = 16)
plotPredictions(modelsel)

par(mfrow = c(1, 1))
