Slda: Shrunken Linear Discriminant Analysis.

Description

Slda finds the coefficients of a linear discriminant rule based on Fisher and Sun's (2011) estimate and generalizations of Ledoit and Wolf's (2004) optimal shrunken covariance matrix.

Usage

## S3 method for class 'default':
Slda(data, grouping, prior = "proportions", StddzData = TRUE, VSelfunct = SelectV,
     Trgt = c("CnstDiag", "Idntty", "VarDiag"), minp = 20,
     ldafun = c("canonical", "classification"), ...)
## S3 method for class 'data.frame':
Slda(data, ...)

Arguments

data

Matrix or data frame of observations.

grouping

Factor specifying the class for each observation.

prior

The prior probabilities of class membership. If unspecified, the class proportions for the training set are used. If present, the probabilities should be specified in the order of the factor levels.
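The ordering requirement can be checked with base R before calling Slda. The sketch below uses hypothetical toy labels and does not call Slda itself; it only shows how the training-set proportions are laid out in factor-level order and how a user-supplied prior should be aligned with levels(grouping).

```r
# Hypothetical toy labels; any factor behaves the same way.
grouping <- factor(c("tumour", "normal", "tumour", "tumour", "normal"))

# Class proportions of the training set, in the order of levels(grouping).
props <- as.vector(table(grouping)) / length(grouping)
names(props) <- levels(grouping)

# A user-supplied prior must follow the same order. Here levels(grouping)
# is c("normal", "tumour"), so the first entry refers to "normal".
custom_prior <- c(0.3, 0.7)
names(custom_prior) <- levels(grouping)

print(props)
print(custom_prior)
```

Naming the prior vector by levels(grouping) is only a readability aid in this sketch; the help text above states that the order of the probabilities is what matters.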

StddzData

A boolean flag indicating whether the data should be standardized first (default) or used in their original scales.

VSelfunct

Variable selection function. Either the string none (no selection is to be performed) or a function that takes data and grouping as its first two arguments and returns a list with two components: (i)

Trgt

A string code with the target type used by the shrunken estimator. The alternatives are CnstDiag for a Ledoit-Wolf constant diagonal target, Idntty for a p-dimensional identity, and VarDiag for a diagonal

minp

Minimum number of variables required for the estimation of the target intensity to be considered reliable. If the dimension of Sigma is below minp, no shrunken estimate is computed and the original sample covariance is employed.
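The interplay between the target and the minp threshold can be illustrated in base R. This is a minimal sketch under stated assumptions, not the Slda implementation: shrink_cov is a hypothetical helper, lambda is a fixed, hand-picked intensity (Slda estimates it from the data), the covariance is taken over a single sample rather than within groups, and only the constant diagonal target (average variance times the identity) is shown.

```r
# Sketch only: 'shrink_cov' and its fixed 'lambda' are hypothetical.
shrink_cov <- function(X, lambda = 0.2, minp = 20) {
  S <- cov(X)                        # sample covariance matrix
  p <- ncol(X)
  if (p < minp) return(S)            # too few variables: keep the sample covariance
  target <- diag(mean(diag(S)), p)   # constant diagonal target
  (1 - lambda) * S + lambda * target # convex combination of sample and target
}

set.seed(1)
X <- matrix(rnorm(30 * 25), nrow = 30, ncol = 25)

S_full  <- shrink_cov(X)         # 25 variables >= minp: shrunken estimate
S_small <- shrink_cov(X[, 1:5])  # 5 variables < minp: plain sample covariance
```

With this target the diagonal average is unchanged, so the trace of the estimate equals the trace of the sample covariance, while every off-diagonal entry is scaled by 1 - lambda.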

ldafun

Type of discriminant linear functions computed. The alternatives are canonical for maximum-discrimination canonical functions and classification for direct-classification functions.
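For intuition about the classification option, the following base-R sketch builds the single (k - 1 = 1) linear classification function of a two-class problem with equal priors, together with its threshold. It uses the textbook Fisher rule with a pooled sample covariance on hypothetical toy data; the names w, threshold and classify are illustrative, and the sign and scaling conventions used by Slda may differ.

```r
set.seed(42)
# Hypothetical toy data: two classes, three variables.
X1 <- matrix(rnorm(60), ncol = 3)            # class 1, centred at 0
X2 <- matrix(rnorm(60, mean = 2), ncol = 3)  # class 2, centred at 2
m1 <- colMeans(X1)
m2 <- colMeans(X2)

# Pooled within-groups covariance matrix.
n1 <- nrow(X1)
n2 <- nrow(X2)
Sw <- ((n1 - 1) * cov(X1) + (n2 - 1) * cov(X2)) / (n1 + n2 - 2)

# Coefficients of the classification function and its equal-priors threshold.
w <- solve(Sw, m2 - m1)
threshold <- sum(w * (m1 + m2) / 2)

# Assign an observation to class 2 when its score exceeds the threshold.
classify <- function(x) if (sum(w * x) > threshold) 2L else 1L
```

Computing the rule this way needs only one linear solve, which is consistent with the note in the Examples section that the classification option avoids eigen-decompositions.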

...

Further arguments passed to or from other methods.

Value

If argument ldafun is set to canonical, an object of class Scanlda, which extends class canldaRes, with the following components:

prior

The prior probabilities used.

means

The class means.

scaling

A matrix which transforms observations to discriminant functions, normalized so that the within groups covariance matrix is spherical.

svd

The singular values, which give the ratio of the between- and within-group standard deviations on the linear discriminant variables. Their squares are the canonical F-statistics.

vkpt

A vector with the indices of the variables kept in the discriminant rule if the number of variables kept is less than ncol(data). NULL otherwise.

nvkpt

The number of variables kept in the discriminant rule if this number is less than ncol(data). NULL otherwise.

SSig

An object of class ShrnkMat with a compact representation of the within groups covariance matrix. ShrnkMat objects have specialized methods for matrix inversion, multiplication, and element-wise arithmetic operations.

SSigInv

An object of class ShrnkMatInv with a compact representation of the within groups precision (inverse covariance) matrix. ShrnkMatInv objects have specialized methods for matrix inversion, multiplication, and element-wise arithmetic operations.

N

The number of observations used.

call

The (matched) function call.

If argument ldafun is set to classification, an object of class Scllda, which extends class clldaRes, with the following components:

prior

The prior probabilities used.

means

The class means.

coef

A matrix with the coefficients of the k-1 classification functions.

cnst

A vector with the thresholds (2nd members of linear classification rules) used in classification rules that assume equal priors.

vkpt

A vector with the indices of the variables kept in the discriminant rule if the number of variables kept is less than ncol(data). NULL otherwise.

nvkpt

The number of variables kept in the discriminant rule if this number is less than ncol(data). NULL otherwise.

SSig

An object of class ShrnkMat with a compact representation of the within groups covariance matrix. ShrnkMat objects have specialized methods for matrix inversion, multiplication, and element-wise arithmetic operations.

SSigInv

An object of class ShrnkMatInv with a compact representation of the within groups precision (inverse covariance) matrix. ShrnkMatInv objects have specialized methods for matrix inversion, multiplication, and element-wise arithmetic operations.

N

The number of observations used.

call

The (matched) function call.

References

Ledoit, O. and Wolf, M. (2004) A well-conditioned estimator for large-dimensional covariance matrices, Journal of Multivariate Analysis, 88 (2), 365-411.

Fisher, T.J. and Sun, X. (2011) Improved Stein-type shrinkage estimators for the high-dimensional multivariate normal covariance matrix, Computational Statistics and Data Analysis, 55 (1), 1909-1918.

Duarte Silva, A.P. (2011) Two Group Classification with High-Dimensional Correlated Data: A Factor Model Approach, Computational Statistics and Data Analysis, 55 (1), 2975-2990.

Examples

# load the package that provides Slda, DACrossVal and the AlonDS data set

library(HiDimDA)

# train classifier on Alon's Colon Cancer Data set after a logarithmic transformation
# (selecting genes by the Expanded HC scheme).

ldarule <- Slda(log10(AlonDS[,-1]), AlonDS$grouping)

# show classification rule

print(ldarule)

# get in-sample classification results

predict(ldarule, log10(AlonDS[,-1]), grpcodes=levels(AlonDS$grouping))$class

# compare classifications with true assignments

cat("Original classes:\n")
print(AlonDS[,1])


# Estimate error rates by four-fold cross-validation.
# (Note: In cross-validation analysis it is recommended to set the argument
# 'ldafun' to "classification", in order to speed up computations by avoiding
# unnecessary eigen-decompositions)

CrosValRes <- DACrossVal(log10(AlonDS[,-1]), AlonDS$grouping, TrainAlg=Slda,
                         ldafun="classification", kfold=4, CVrep=1)
summary(CrosValRes[,,"Clerr"])
