enpls (version 5.6)

enspls.ad: Ensemble Sparse Partial Least Squares for Model Applicability Domain Evaluation

Description

Model applicability domain evaluation with ensemble sparse partial least squares.

Usage

enspls.ad(x, y, xtest, ytest, maxcomp = 5L, cvfolds = 5L,
  alpha = seq(0.2, 0.8, 0.2), space = c("sample", "variable"),
  method = c("mc", "boot"), reptimes = 500L, ratio = 0.8,
  parallel = 1L)

Arguments

x
Predictor matrix of the training set.
y
Response vector of the training set.
xtest
List, with the i-th component being the i-th test set's predictor matrix (see example code below).
ytest
List, with the i-th component being the i-th test set's response vector (see example code below).
maxcomp
Maximum number of components included in each model. Default is 5.
cvfolds
Number of cross-validation folds used in each model for automatic parameter selection. Default is 5.
alpha
Grid of candidate values for the parameter controlling the sparsity of the model. Default is seq(0.2, 0.8, 0.2).
space
Space in which to apply the resampling method. Can be the sample space ("sample") or the variable space ("variable").
method
Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".
reptimes
Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500.
ratio
Sampling ratio used when method = "mc". Default is 0.8.
parallel
Integer. Number of CPU cores to use. Default is 1 (not parallelized).
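
The space, method, and ratio arguments together decide which rows or columns of x each model in the ensemble sees. The base R sketch below illustrates the usual meaning of the two resampling schemes: Monte-Carlo draws a subset of size round(n * ratio) without replacement, and bootstrapping draws n indices with replacement. The helper draw_index() is hypothetical and not part of enpls; the package's internal implementation may differ in its details.

```r
# Hypothetical helper (not part of enpls): draw one set of resampled indices.
draw_index <- function(n, method = c("mc", "boot"), ratio = 0.8) {
  method <- match.arg(method)
  if (method == "mc") {
    # Monte-Carlo: a random subset of round(n * ratio) indices, no repeats
    sample(n, size = round(n * ratio), replace = FALSE)
  } else {
    # Bootstrap: n indices drawn with replacement, repeats allowed
    sample(n, size = n, replace = TRUE)
  }
}

set.seed(1)
x <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6)  # toy predictor matrix

# space = "sample": resample the rows of x
idx.row <- draw_index(nrow(x), method = "mc", ratio = 0.8)
x.sub.rows <- x[idx.row, , drop = FALSE]

# space = "variable": resample the columns of x
idx.col <- draw_index(ncol(x), method = "boot")
x.sub.cols <- x[, idx.col, drop = FALSE]

dim(x.sub.rows)
dim(x.sub.cols)
```

Because ratio only sets the subset size in the Monte-Carlo branch, it has no effect when method = "boot", which matches the description of the ratio argument above.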

Value

A list containing:
  • tr.error.mean - absolute mean prediction error for training set
  • tr.error.median - absolute median prediction error for training set
  • tr.error.sd - prediction error sd for training set
  • tr.error.matrix - raw prediction error matrix for training set
  • te.error.mean - list of absolute mean prediction errors for test set(s)
  • te.error.median - list of absolute median prediction errors for test set(s)
  • te.error.sd - list of prediction error sds for test set(s)
  • te.error.matrix - list of raw prediction error matrices for test set(s)
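
As a rough illustration of how the summary elements might relate to a raw error matrix, the base R sketch below computes an absolute mean, an absolute median, and a standard deviation from a toy matrix. It assumes one row per sample and one column per model; this page does not state the orientation enspls.ad() uses, so treat the layout and the variable names as assumptions rather than the package's actual code.

```r
# Toy error matrix. ASSUMPTION: one row per sample, one column per model;
# the orientation used by enspls.ad() is not stated on this page.
err <- matrix(c( 0.5, -0.3,  0.1,
                -1.0, -0.8, -1.2,
                 0.2,  0.0, -0.2),
              nrow = 3, byrow = TRUE)

err.mean   <- abs(rowMeans(err))          # absolute value of the mean error per sample
err.median <- abs(apply(err, 1, median))  # absolute value of the median error per sample
err.sd     <- apply(err, 1, sd)           # spread of the errors across models per sample

round(cbind(err.mean, err.median, err.sd), 3)
```

Under this reading, a sample whose errors are both large on average and widely spread across the ensemble is a candidate for lying outside the model's applicability domain.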

Examples

library("enpls")
data("logd1k")
# remove low variance variables
x = logd1k$x[, -c(17, 52, 59)]
y = logd1k$y

# training set
x.tr = x[1:300, ]
y.tr = y[1:300]

# two test sets
x.te = list("test.1" = x[301:400, ],
            "test.2" = x[401:500, ])
y.te = list("test.1" = y[301:400],
            "test.2" = y[401:500])

set.seed(42)
ad = enspls.ad(x.tr, y.tr, x.te, y.te,
               maxcomp = 3, alpha = c(0.3, 0.6, 0.9),
               space = "variable", method = "mc",
               ratio = 0.8, reptimes = 10)
print(ad)
plot(ad)
# The interactive plot requires an HTML viewer
## Not run: 
# plot(ad, type = "interactive")
## End(Not run)