plsda
Partial Least Squares and Sparse Partial Least Squares Discriminant Analysis
plsda is used to fit standard PLS models for classification, while splsda performs sparse PLS that embeds feature selection and regularization for the same purpose.
- Keywords
- models
Usage
plsda(x, ...)
## S3 method for class 'default':
plsda(x, y, ncomp = 2, probMethod = "softmax", prior = NULL, ...)
## S3 method for class 'plsda':
predict(object, newdata = NULL, ncomp = NULL, type = "class", ...)
splsda(x, ...)
## S3 method for class 'default':
splsda(x, y, probMethod = "softmax", prior = NULL, ...)
## S3 method for class 'splsda':
predict(object, newdata = NULL, type = "class", ...)
Arguments
- x
- a matrix or data frame of predictors
- y
- a factor or indicator matrix for the discrete outcome. If a matrix, the entries must be either 0 or 1 and rows must sum to one
- ncomp
- the number of components to include in the model. Predictions can be made for models with values less than ncomp.
- probMethod
- either "softmax" or "Bayes" (see Details)
- prior
- a vector of prior probabilities for the classes (only used for probMethod = "Bayes")
- ...
- arguments to pass to plsr or spls. For splsda, this is the method for passing tuning parameter specifications (e.g. K and eta, as in the Examples)
- object
- an object produced by plsda
- newdata
- a matrix or data frame of predictors
- type
- either "class", "prob" or "raw" to produce the predicted class, class probabilities or the raw model scores, respectively.
Details
If a factor is supplied, the appropriate indicator matrix is created.
A multivariate PLS model is fit to the indicator matrix using the plsr or spls function.
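As a rough illustration of the indicator matrix described above, the following base R sketch builds one from a factor. The toy outcome and the object names are assumptions made for illustration; this is not necessarily how plsda constructs the matrix internally.

```r
# Toy outcome; the class labels are illustrative only.
y <- factor(c("Active", "Inactive", "Active", "Inactive", "Inactive"))

# One column per class level, with a 1 marking each sample's class.
ind <- diag(nlevels(y))[as.integer(y), , drop = FALSE]
colnames(ind) <- levels(y)

ind
rowSums(ind)  # every row sums to one, as required when y is given as a matrix
```

A matrix of this shape is what the multivariate PLS model treats as its response, with one response column per class.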
Two prediction methods can be used.
The softmax function transforms the model predictions to "probability-like" values (i.e. each value lies in [0, 1] and the values for a sample sum to 1). The class with the largest class probability is the predicted class.
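The softmax step can be sketched in base R as follows. The helper name softmax_rows and the raw scores are hypothetical, and subtracting the row maximum is a numerical-stability choice made for this sketch rather than something the package is known to do.

```r
# Hypothetical helper: apply the softmax transformation to each row of raw scores.
softmax_rows <- function(scores) {
  # Subtracting each row's maximum leaves the result unchanged but avoids overflow.
  shifted <- scores - apply(scores, 1, max)
  expd <- exp(shifted)
  expd / rowSums(expd)
}

# Made-up raw model predictions for three samples and two classes.
raw <- matrix(c(0.8,  0.2,
                0.3,  0.7,
                1.2, -0.2),
              ncol = 2, byrow = TRUE,
              dimnames = list(NULL, c("Active", "Inactive")))

probs <- softmax_rows(raw)

# The predicted class is the column with the largest "probability".
pred <- factor(colnames(probs)[max.col(probs, ties.method = "first")],
               levels = colnames(probs))
```

Because softmax is monotone within a row, the predicted class is the same one that has the largest raw score.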
Alternatively, Bayes' rule can be applied to the model predictions to form posterior probabilities. Here, the model predictions for the training set are used along with the training set outcomes to create conditional distributions for each class. When new samples are predicted, the raw model predictions are run through these conditional distributions and combined with the prior to produce a posterior probability for each class. This process is repeated ncomp times, once for each possible PLS model. The NaiveBayes function is used with usekernel = TRUE for the posterior probability calculations.
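The idea behind the Bayes option can be approximated with a small base R sketch. It is only an illustration under stated assumptions: the helper kernel_nb_posterior is hypothetical, the scores are simulated, and the density estimates come from base R's density() rather than from the NaiveBayes function that the package actually uses.

```r
# Hypothetical helper: posterior class probabilities from raw scores using
# per-class kernel density estimates (a naive Bayes style calculation).
kernel_nb_posterior <- function(train_scores, train_y, new_scores, prior = NULL) {
  classes <- levels(train_y)
  if (is.null(prior)) {
    # Assumption for this sketch: default to the observed class frequencies.
    prior <- as.numeric(table(train_y)) / length(train_y)
  }
  lik <- matrix(1, nrow = nrow(new_scores), ncol = length(classes),
                dimnames = list(NULL, classes))
  for (k in seq_along(classes)) {
    in_class <- train_y == classes[k]
    for (j in seq_len(ncol(train_scores))) {
      # Conditional distribution of score column j within class k.
      d <- density(train_scores[in_class, j])
      # Evaluate the estimated density at the new scores (clamped at the ends).
      dens <- approx(d$x, d$y, xout = new_scores[, j], rule = 2)$y
      lik[, k] <- lik[, k] * pmax(dens, 1e-12)
    }
  }
  # Bayes' rule: prior times likelihood, normalized within each sample.
  post <- sweep(lik, 2, prior, "*")
  post / rowSums(post)
}

# Simulated training-set raw predictions for two classes.
set.seed(1)
train_y <- factor(rep(c("Active", "Inactive"), each = 50))
train_scores <- cbind(c(rnorm(50, 1, 0.3), rnorm(50, 0, 0.3)),
                      c(rnorm(50, 0, 0.3), rnorm(50, 1, 0.3)))
new_scores <- rbind(c(0.9, 0.1), c(0.1, 0.9))

post <- kernel_nb_posterior(train_scores, train_y, new_scores)
```

Unlike the softmax values, these posteriors depend on how the raw predictions were distributed in the training set and on the supplied prior.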
Value
- For plsda, an object of class "plsda" and "mvr". For splsda, an object of class "splsda".
The predict methods produce either a vector, matrix or three-dimensional array, depending on the values of type and ncomp. For example, specifying more than one value of ncomp with type = "class" will produce a three-dimensional array, but the default specification would produce a factor vector.
See Also
Examples
library(caret)

## load the data and hold out a test set
data(mdrr)
set.seed(1)
inTrain <- sample(seq_along(mdrrClass), 450)

## remove near-zero-variance descriptors
nzv <- nearZeroVar(mdrrDescr)
filteredDescr <- mdrrDescr[, -nzv]

training <- filteredDescr[inTrain, ]
test <- filteredDescr[-inTrain, ]
trainMDRR <- mdrrClass[inTrain]
testMDRR <- mdrrClass[-inTrain]

## center and scale using the training set statistics
preProcValues <- preProcess(training)
trainDescr <- predict(preProcValues, training)
testDescr <- predict(preProcValues, test)
useBayes <- plsda(trainDescr, trainMDRR, ncomp = 5,
                  probMethod = "Bayes")
useSoftmax <- plsda(trainDescr, trainMDRR, ncomp = 5)

confusionMatrix(predict(useBayes, testDescr), testMDRR)
confusionMatrix(predict(useSoftmax, testDescr), testMDRR)

histogram(
  ~ predict(useBayes, testDescr, type = "prob")[, "Active"]
  | testMDRR, xlab = "Active Prob", xlim = c(-.1, 1.1))
histogram(
  ~ predict(useSoftmax, testDescr, type = "prob")[, "Active", ]
  | testMDRR, xlab = "Active Prob", xlim = c(-.1, 1.1))
## different sized objects are returned
length(predict(useBayes, testDescr))
dim(predict(useBayes, testDescr, ncomp = 1:3))
dim(predict(useBayes, testDescr, type = "prob"))
dim(predict(useBayes, testDescr, type = "prob", ncomp = 1:3))
## using spls
splsFit <- splsda(trainDescr, trainMDRR,
                  K = 5, eta = .9,
                  probMethod = "Bayes")

confusionMatrix(predict(splsFit, testDescr), testMDRR)