sda (version 1.3.7)

# sda: Shrinkage Discriminant Analysis 2: Training Step

## Description

`sda` trains an LDA or DDA classifier using James-Stein-type shrinkage estimation.

## Usage

`sda(Xtrain, L, lambda, lambda.var, lambda.freqs, diagonal=FALSE, verbose=TRUE)`

## Arguments

- `Xtrain`: A matrix containing the training data set. Note that the rows correspond to observations and the columns to variables.
- `L`: A factor with the class labels of the training samples.
- `lambda`: Shrinkage intensity for the correlation matrix. If not specified, it is estimated from the data. `lambda=0` implies no shrinkage and `lambda=1` complete shrinkage.
- `lambda.var`: Shrinkage intensity for the variances. If not specified, it is estimated from the data. `lambda.var=0` implies no shrinkage and `lambda.var=1` complete shrinkage.
- `lambda.freqs`: Shrinkage intensity for the class frequencies. If not specified, it is estimated from the data. `lambda.freqs=0` implies no shrinkage (i.e. empirical frequencies) and `lambda.freqs=1` complete shrinkage (i.e. uniform frequencies).
- `diagonal`: Chooses between LDA (default, `diagonal=FALSE`) and DDA (`diagonal=TRUE`).
- `verbose`: Print out some info while computing.
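
The shrinkage intensities are normally estimated from the data, but all three can also be fixed by hand via the `lambda` arguments. A minimal sketch; the simulated data and the intensity value 0.5 are arbitrary, for illustration only:

```
library("sda")

# small simulated training set (arbitrary, for illustration only)
set.seed(1)
Xtrain = matrix(rnorm(40*10), nrow=40)     # 40 observations, 10 variables
Ytrain = factor(rep(c("A", "B"), each=20))

# fix all three shrinkage intensities instead of estimating them
fit = sda(Xtrain, Ytrain, lambda=0.5, lambda.var=0.5, lambda.freqs=0.5,
          verbose=FALSE)
```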

## Value

`sda` trains the classifier and returns an `sda` object with the following components needed for the subsequent prediction:

- `regularization`: a vector containing the three estimated shrinkage intensities,
- `freqs`: the estimated class frequencies,
- `alpha`: a vector containing the intercepts used for prediction,
- `beta`: a matrix containing the coefficients used for prediction.
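
These components can be inspected directly on a fitted object. A short sketch on simulated data (the data are arbitrary; the component names are as documented above):

```
library("sda")

set.seed(1)
Xtrain = matrix(rnorm(40*10), nrow=40)     # 40 observations, 10 variables
Ytrain = factor(rep(c("A", "B"), each=20))

fit = sda(Xtrain, Ytrain, verbose=FALSE)
fit$regularization   # the three estimated shrinkage intensities
fit$freqs            # estimated class frequencies
length(fit$alpha)    # one intercept per class
dim(fit$beta)        # one row of coefficients per class
```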

## Details

In order to train the LDA or DDA classifier, three separate shrinkage estimators are employed:

- class frequencies: the estimator `freqs.shrink` from Hausser and Strimmer (2008),
- variances: the estimator `var.shrink` from Opgen-Rhein and Strimmer (2007),
- correlations: the estimator `cor.shrink` from Schäfer and Strimmer (2005).

Note that the three corresponding regularization parameters are obtained analytically, without resorting to computer-intensive resampling.
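
The three estimators are exported by the `entropy` and `corpcor` packages (dependencies of `sda`) and can also be called on their own. A hedged sketch on simulated data, assuming those packages are installed:

```
library("entropy")   # provides freqs.shrink (Hausser and Strimmer, 2008)
library("corpcor")   # provides var.shrink and cor.shrink

set.seed(1)
X = matrix(rnorm(30*5), nrow=30)           # 30 observations, 5 variables
y = factor(rep(c("A", "B", "C"), each=10))

freqs.shrink(table(y))   # shrunken class frequencies
var.shrink(X)            # shrunken variances, one per column
cor.shrink(X)            # shrunken 5 x 5 correlation matrix
```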

## References

Ahdesmäki, M., and K. Strimmer. 2010. Feature selection in omics prediction problems using cat scores and false non-discovery rate control. Ann. Appl. Stat. 4: 503-519. Preprint available from http://arxiv.org/abs/0903.2003.

## See Also

`predict.sda`, `sda.ranking`, `freqs.shrink`, `var.shrink`, `invcor.shrink`.

## Examples

```
# load sda library
library("sda")

##########################
# training and test data #
##########################

# data set containing the SRBCT samples
get.srbct = function()
{
data(khan2001)
idx = which( khan2001$y == "non-SRBCT" )
x = khan2001$x[-idx,]
y = factor(khan2001$y[-idx])
descr = khan2001$descr[-idx]

list(x=x, y=y, descr=descr)
}
srbct = get.srbct()

# training data
Xtrain = srbct$x[1:63,]
Ytrain = srbct$y[1:63]
Xtest = srbct$x[64:83,]
Ytest = srbct$y[64:83]

###################################################
# classification with correlation (shrinkage LDA) #
###################################################

sda.fit = sda(Xtrain, Ytrain)
ynew = predict(sda.fit, Xtest)$class # using all 2308 features
sum(ynew != Ytest)

###########################################################
# classification with diagonal covariance (shrinkage DDA) #
###########################################################

sda.fit = sda(Xtrain, Ytrain, diagonal=TRUE)
ynew = predict(sda.fit, Xtest)$class # using all 2308 features
sum(ynew != Ytest)

#################################################################
# for complete example scripts illustrating classification with #
# feature selection visit http://strimmerlab.org/software/sda/  #
#################################################################
```