plsmod-package: parsnip methods for partial least squares (PLS)

Description

plsmod offers a function to fit ordinary, sparse, and discriminant analysis PLS models.

Examples

For regression, let’s use the Tecator data in the modeldata package:

library(tidymodels)
library(plsmod)
tidymodels_prefer()
theme_set(theme_bw())
data(meats, package = "modeldata")

Note that using tidymodels_prefer() will result in getting parsnip::pls() instead of mixOmics::pls() when simply running pls().

Although plsmod can fit multivariate models, we’ll concentrate on a univariate model that predicts the percentage of protein in the samples.

meats <- meats %>% select(-water, -fat)

We define a sparse PLS model by setting the predictor_prop argument to a value less than one. This allows the model fitting process to set certain loadings to zero via regularization.
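As a rough illustration of what the proportion means in terms of predictor counts, the arithmetic below assumes the number of retained predictors is the proportion times the number of columns, rounded up. That rounding rule is an assumption made for this sketch, not something stated by plsmod; it is chosen because it matches the 34 variables per component reported in the printed output for 100 predictors and predictor_prop = 1/3.

```r
# Hypothetical helper arithmetic: not part of plsmod's API.
n_predictors   <- 100   # columns of X in the meats example
predictor_prop <- 1/3   # value used in the model specification

# Assumed rule: round the retained count up to a whole predictor.
n_keep <- ceiling(n_predictors * predictor_prop)
n_keep
```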

sparse_pls_spec <- 
  pls(num_comp = 10, predictor_prop = 1/3) %>% 
  set_engine("mixOmics") %>% 
  set_mode("regression")

The model is fit either with a formula or by passing the predictors and outcomes separately:

form_fit <- 
  sparse_pls_spec %>% 
  fit(protein ~ ., data = meats)
form_fit

## parsnip model object
## 
## 
## Call:
##  mixOmics::spls(X = x, Y = y, ncomp = ncomp, keepX = keepX) 
## 
##  sPLS with a 'regression' mode with 10 sPLS components. 
##  You entered data X of dimensions: 215 100 
##  You entered data Y of dimensions: 215 1 
## 
##  Selection of [34] [34] [34] [34] [34] [34] [34] [34] [34] [34] variables on each of the sPLS components on the X data set. 
##  Selection of [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] variables on each of the sPLS components on the Y data set. 
## 
##  Main numerical outputs: 
##  -------------------- 
##  loading vectors: see object$loadings 
##  variates: see object$variates 
##  variable names: see object$names 
## 
##  Functions to visualise samples: 
##  -------------------- 
##  plotIndiv, plotArrow 
## 
##  Functions to visualise variables: 
##  -------------------- 
##  plotVar, plotLoadings, network, cim

# or 
sparse_pls_spec %>% 
  fit_xy(x = meats %>% select(-protein), y = meats$protein)

## parsnip model object
## 
## 
## Call:
##  mixOmics::spls(X = x, Y = y, ncomp = ncomp, keepX = keepX) 
## 
##  sPLS with a 'regression' mode with 10 sPLS components. 
##  You entered data X of dimensions: 215 100 
##  You entered data Y of dimensions: 215 1 
## 
##  Selection of [34] [34] [34] [34] [34] [34] [34] [34] [34] [34] variables on each of the sPLS components on the X data set. 
##  Selection of [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] variables on each of the sPLS components on the Y data set. 
## 
##  Main numerical outputs: 
##  -------------------- 
##  loading vectors: see object$loadings 
##  variates: see object$variates 
##  variable names: see object$names 
## 
##  Functions to visualise samples: 
##  -------------------- 
##  plotIndiv, plotArrow 
## 
##  Functions to visualise variables: 
##  -------------------- 
##  plotVar, plotLoadings, network, cim
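
Once fit, the model object can be used like any other parsnip model. The sketch below refits the formula version and predicts the first few samples; it assumes the standard parsnip predict() interface, which returns a tibble with a .pred column for regression models.

```r
library(tidymodels)
library(plsmod)
tidymodels_prefer()

data(meats, package = "modeldata")
meats <- meats %>% select(-water, -fat)

# Same sparse PLS specification as in the example above
sparse_pls_spec <-
  pls(num_comp = 10, predictor_prop = 1/3) %>%
  set_engine("mixOmics") %>%
  set_mode("regression")

form_fit <-
  sparse_pls_spec %>%
  fit(protein ~ ., data = meats)

# Predict protein for the first six samples (training data, for illustration only)
preds <- predict(form_fit, new_data = head(meats))
preds
```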

The pls() function can also be used with categorical outcomes.
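
A minimal sketch of a discriminant analysis fit, assuming the built-in iris data as a stand-in for a real classification problem (the data set and the choice of two components are illustrative, not from the package documentation):

```r
library(tidymodels)
library(plsmod)
tidymodels_prefer()

# Same model function, with the mode switched to classification
plsda_spec <-
  pls(num_comp = 2) %>%
  set_engine("mixOmics") %>%
  set_mode("classification")

# Species is a three-level factor outcome
plsda_fit <-
  plsda_spec %>%
  fit(Species ~ ., data = iris)

# Hard class predictions on the training data (for illustration only)
class_preds <- predict(plsda_fit, new_data = iris)
head(class_preds)
```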

Author

Maintainer: Max Kuhn <max@rstudio.com>

Other contributors:

RStudio [copyright holder]

Details

The model function works with the tidymodels infrastructure, so the model can be resampled, tuned, tidied, etc.
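
As one possible illustration of tuning, the sketch below marks both arguments with tune() and evaluates a small grid over cross-validation folds. The grid values, the number of folds, and the seed are arbitrary choices made for this example.

```r
library(tidymodels)
library(plsmod)
tidymodels_prefer()

data(meats, package = "modeldata")
meats <- meats %>% select(-water, -fat)

# Mark both main arguments for tuning
tune_spec <-
  pls(num_comp = tune(), predictor_prop = tune()) %>%
  set_engine("mixOmics") %>%
  set_mode("regression")

set.seed(1)
folds <- vfold_cv(meats, v = 5)

# Small illustrative grid: 3 x 3 = 9 candidate models
grid <- expand.grid(
  num_comp = c(2L, 5L, 10L),
  predictor_prop = c(1/3, 2/3, 1)
)

res <- tune_grid(
  tune_spec,
  protein ~ .,
  resamples = folds,
  grid = grid,
  metrics = metric_set(rmse)
)

# Candidates ranked by resampled RMSE
show_best(res, metric = "rmse")
```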