sda: Shrinkage Discriminant Analysis

Description

sda trains an LDA (linear discriminant analysis) or DDA (diagonal discriminant analysis) classifier using Stein-type shrinkage estimation. predict.sda performs the corresponding class prediction.

Usage

sda(Xtrain, L, diagonal=FALSE, verbose=TRUE)
## S3 method for class 'sda':
predict(object, Xtest, ...)

Arguments

Xtrain

A matrix containing the training data set. Note that the rows are sample observations and the columns are variables.

L

A factor with the class labels of the training samples.

diagonal

Chooses between LDA (default, diagonal=FALSE) and DDA (diagonal=TRUE).

verbose

If TRUE (default), report the estimated shrinkage intensities.

object

An sda fit object obtained from the function sda.

Xtest

A matrix containing the test data set.

...

Additional arguments for generic predict.

Value

sda trains the classifier and returns an sda object with the following components:
regularization: a vector containing the three estimated shrinkage intensities,
prior: the estimated class frequencies,
means: a matrix containing the group means (centroids), and
invcov: the inverse of the estimated pooled covariance matrix (this is a matrix in the LDA and a vector in the DDA case).
predict.sda predicts class probabilities for each test sample and returns a list with two components:
yhat: a factor with the most likely class assignment, and
probs: a matrix containing the class probabilities for each test sample.

Details

In order to train the LDA or DDA classifier, three separate shrinkage estimators are employed:

class frequencies: the estimator freqs.shrink from Hausser and Strimmer (2008),
variances: the estimator var.shrink from Opgen-Rhein and Strimmer (2007),
correlations: the estimator invcor.shrink from Schäfer and Strimmer (2005).
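
To give a feel for what shrinking the correlations achieves, the following base R sketch shrinks a sample correlation matrix toward the identity. It is only an illustration: the intensity lambda is fixed by hand here (an assumption), whereas the estimators listed above determine it from the data, and none of the variable names below come from the package.

```r
# Illustrative only: shrink a sample correlation matrix toward the identity.
# `lambda` is a hypothetical, hand-picked intensity; the package estimates it.
set.seed(1)
n <- 5                                # number of samples
p <- 10                               # number of variables (p > n)
X <- matrix(rnorm(n * p), nrow = n)   # rows = samples, columns = variables

R <- cor(X)                           # sample correlations, rank-deficient as n < p
lambda <- 0.3                         # assumed shrinkage intensity in [0, 1]
R.shrunk <- lambda * diag(p) + (1 - lambda) * R

# Smallest eigenvalue: numerically zero before shrinkage, about lambda after
min.eig.raw <- min(eigen(R, symmetric = TRUE, only.values = TRUE)$values)
min.eig.shrunk <- min(eigen(R.shrunk, symmetric = TRUE, only.values = TRUE)$values)

# The shrunken matrix is positive definite, so it can be inverted
R.inv <- solve(R.shrunk)
```

The off-diagonal entries are scaled by (1 - lambda) while the diagonal stays at 1, which is what makes the inverse well defined even when there are fewer samples than variables.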

These estimates are plugged into the LDA and DDA discriminant functions for prediction. Note that the three corresponding regularization parameters are obtained analytically, without resorting to computationally intensive resampling.

This approach is particularly suited for high-dimensional classification, where the number of variables may exceed the number of samples.
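
The plug-in idea can be sketched in base R as follows. This is not the package's internal code: plain empirical estimates stand in for the shrinkage estimates, and the helper name discriminant and its arguments are hypothetical. It uses the standard linear discriminant score, that is, x' S^-1 mu_k - 0.5 mu_k' S^-1 mu_k + log(prior_k), and for DDA simply discards the off-diagonal entries of the pooled covariance.

```r
# Sketch of a plug-in discriminant function (not the package's implementation).
# Plain empirical estimates are used where sda would use shrinkage estimates.
data(iris)
X <- as.matrix(iris[, 1:4])
L <- iris[, 5]

prior <- as.vector(table(L)) / length(L)        # class frequencies
means <- t(sapply(levels(L), function(k) {
  colMeans(X[L == k, , drop = FALSE])           # one centroid per class
}))                                             # K x p matrix

Xc <- X - means[as.integer(L), ]                # center samples by class mean
S <- crossprod(Xc) / (nrow(X) - nlevels(L))     # pooled covariance

# Hypothetical helper: scores, class probabilities, and predicted labels
discriminant <- function(Xtest, prior, means, S, diagonal = FALSE) {
  if (diagonal) S <- diag(diag(S))              # DDA: ignore correlations
  Sinv <- solve(S)
  lin <- Xtest %*% Sinv %*% t(means)            # n x K linear terms
  const <- -0.5 * rowSums((means %*% Sinv) * means) + log(prior)
  scores <- sweep(lin, 2, const, "+")
  probs <- exp(scores - apply(scores, 1, max))  # numerically stable softmax
  probs <- probs / rowSums(probs)
  colnames(probs) <- rownames(means)
  best <- max.col(probs, ties.method = "first")
  yhat <- factor(rownames(means)[best], levels = rownames(means))
  list(yhat = yhat, probs = probs)
}

lda.out <- discriminant(X, prior, means, S)
dda.out <- discriminant(X, prior, means, S, diagonal = TRUE)
```

Replacing prior, S, and its inverse with their shrinkage counterparts is what distinguishes the approach described above from this plain version.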

Examples


library("sda")

## prepare data set
data(iris) # good old iris data
X = as.matrix(iris[,1:4])
Y = iris[,5]

# divide into test and training data sets
tr.index = sample(1:length(Y), 2/3*length(Y))
train.x = X[tr.index,]                     
train.y = Y[tr.index]
test.x = X[-tr.index,] 
test.y = Y[-tr.index]


## shrinkage LDA
sda.fit = sda(Xtrain=train.x, L=train.y)
sda.fit

predict(sda.fit, test.x)

ynew = predict(sda.fit, test.x)$yhat
sum(ynew != test.y)


## shrinkage DDA
sda.fit = sda(Xtrain=train.x, L=train.y, diagonal=TRUE)
sda.fit

predict(sda.fit, test.x)

ynew = predict(sda.fit, test.x)$yhat
sum(ynew != test.y)
