pls: Partial Least Squares modelling of GEM objects.

Description

The output of GEM is used as input to a PLS classification with the selected effect as response. It is possible to compare two models using the gem2 argument. Variable selection is available through Jackknifing (from package pls) and Shaving (from package plsVarSel).

Usage

pls(gem, ...)
# S3 method for GEM
pls(
  gem,
  effect,
  ncomp,
  newdata = NULL,
  gem2,
  validation,
  jackknife = NULL,
  shave = NULL,
  df.used = gem$df.used,
  ...
)

Value

An object of classes GEMpls, mvr and list, containing the fitted PLS model, classifications/predictions, data and, optionally, Jackknife or Shaving results.

Arguments

gem: Object of class GEM.
...: Additional arguments for plsr.
effect: The effect to be used as response.
ncomp: Number of PLS components.
newdata: Optional new data matrix for prediction.
gem2: Second object of class GEM for comparison.
validation: Optional validation parameters for plsr.
jackknife: Optional argument specifying whether jackknifing should be applied.
shave: Optional argument indicating if variable shaving should be used. shave should be a list with two elements: the PLS filter method and the proportion to remove. shave = TRUE uses defaults: list("sMC", 0.2).
df.used: Optional argument indicating how many degrees of freedom have been consumed during deflation. Defaults to the value stored in the input object (gem$df.used).

Details

When shave is used, the cross-validation segment type is given as type instead of segment.type (see Examples).
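
As a sketch of the shave argument in its explicit list form, the call below spells out the values that the argument description gives as the defaults for shave = TRUE. It reuses the MS data and model formula from the Examples section. The object name plsModS2 is only illustrative, and it is assumed, based on the argument description, that the list elements are given positionally as filter method and proportion to remove.

```r
library(gemR)

data(MS, package = "gemR")
# Subset to reduce runtime, as in the Examples section
MS$proteins <- MS$proteins[, 20:70]
gem <- GEM(proteins ~ MS * group, data = MS[-1, ])

# Explicit form of the documented default, list("sMC", 0.2):
# first element = PLS filter method, second = proportion to remove.
# Note 'type' (not 'segment.type') for the cross-validation segment type.
plsModS2 <- pls(gem, 'MS', 6, validation = "CV",
                type = "interleaved", length.seg = 25,
                shave = list("sMC", 0.2))

# Error as a function of remaining variables
plot(plsModS2)
```

Changing the second list element (for instance to 0.1) would request a smaller proportion to be removed; how this affects the selected variables has not been checked here.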

Examples

data(MS, package = "gemR")
# Subset to reduce runtime in example
MS$proteins <- MS$proteins[,20:70]

gem <- GEM(proteins ~ MS * group, data = MS[-1,])

# Simple PLS using interleaved cross-validation
plsMod <- pls(gem, 'MS', 6, validation = "CV",
              segment.type = "interleaved", length.seg = 25)
plot(plsMod)
scoreplot(plsMod, labels = "names")

# PLS with shaving of variables (note the different argument name, 'type',
# for the cross-validation segment type)
plsModS <- pls(gem, 'MS', 6, validation = "CV",
               type = "interleaved", length.seg = 25, shave = TRUE)
# Error as a function of remaining variables
plot(plsModS)
# Selected variables for minimum error
with(plsModS$shave, colnames(X)[variables[[min.red+1]]])

# Time consuming due to leave-one-out cross-validation
plsModJ <- pls(gem, 'MS', 5, validation = "LOO",
               jackknife = TRUE)
colSums(plsModJ$classes == as.numeric(MS$MS[-1]))
# Jackknifed coefficient P-values (sorted)
plot(sort(plsModJ$jack[,1,1]), pch = '.', ylab = 'P-value')
abline(h = c(0.01, 0.05), col = 2:3)

scoreplot(plsModJ)
scoreplot(plsModJ, comps = c(1,3))   # Selected components
# Use MS categories for colouring and clusters for plot characters.
scoreplot(plsModJ, col = gem$symbolicDesign[['MS']],
          pch = 20 + as.numeric(gem$symbolicDesign[['group']]))
loadingplot(plsModJ, scatter = TRUE) # scatter=TRUE for scatter plot
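
The jackknifed P-values plotted above can also be used to list the variables that fall below a chosen threshold. The following is a minimal sketch, assuming that plsModJ$jack[,1,1] holds one P-value per variable as in the plot above. The names pvals and selected, and the 0.05 cut-off, are illustrative choices rather than part of the package.

```r
library(gemR)

data(MS, package = "gemR")
# Subset to reduce runtime, as in the examples above
MS$proteins <- MS$proteins[, 20:70]
gem <- GEM(proteins ~ MS * group, data = MS[-1, ])

# Time consuming due to leave-one-out cross-validation
plsModJ <- pls(gem, 'MS', 5, validation = "LOO", jackknife = TRUE)

# Same slice of the jackknife results as plotted in the examples
pvals <- plsModJ$jack[, 1, 1]

# Indices of variables below the illustrative 0.05 threshold
selected <- which(pvals < 0.05)
selected
```

The threshold corresponds to one of the reference lines drawn with abline() in the examples; a stricter value such as 0.01 can be substituted in the same way.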