scdaCMA: Shrunken Centroids Discriminant Analysis

Description

The nearest shrunken centroid classification algorithm is described in detail in Tibshirani et al. (2002). It is widely known as PAM (prediction analysis for microarrays) and is also available in the package pamr. For S4 method information, see scdaCMA-methods.

Usage

scdaCMA(X, y, f, learnind, delta = 0.5, models = FALSE, ...)

Arguments

X

Gene expression data. Can be one of the following:

A matrix. Rows correspond to observations, columns to variables.
A data.frame, when f is not missing (see below).
An object of class ExpressionSet.

y

Class labels. Can be one of the following:

A numeric vector.
A factor.
A character, if X is an ExpressionSet, that specifies the phenotype variable.
missing, if X is a data.frame and a proper formula f is provided.

WARNING: The class labels will be re-coded to range from 0 to K-1, where K is the total number of different classes in the learning set.
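The re-coding described in the warning can be illustrated in base R. This is a minimal sketch with hypothetical labels, not the package's internal code; in particular, the assumption that codes follow the sorted factor levels is ours and should be checked against the returned object.

```r
# Hypothetical class labels (not taken from any package data set)
y <- c("EWS", "BL", "NB", "EWS", "RMS", "BL")

# Convert to a factor, then shift the integer codes so they start at 0
y_factor <- factor(y)
y_coded <- as.integer(y_factor) - 1L

# Lookup table from numeric code back to the original label
lookup <- setNames(levels(y_factor), seq_along(levels(y_factor)) - 1L)

print(y_coded)
print(lookup)
```

Keeping a lookup table like this makes it easier to translate predicted codes back into the original class names.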

f

A two-sided formula, if X is a data.frame. The left part corresponds to class labels, the right to variables.
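To show how a two-sided formula separates class labels from variables, the following base R sketch splits a small hypothetical data.frame with model.frame. The data and column names are invented for illustration, and this is not necessarily how scdaCMA processes the formula internally.

```r
# Hypothetical data: one label column and two expression variables
df <- data.frame(
  class = factor(c("a", "b", "a", "b")),
  g1 = c(1.2, 0.4, 1.1, 0.3),
  g2 = c(-0.5, 0.9, -0.4, 1.0)
)

# Left side: class labels; right side: all remaining columns
f <- class ~ .

mf <- model.frame(f, data = df)
labels <- model.response(mf)      # left-hand side of the formula
vars <- mf[, -1, drop = FALSE]    # right-hand side variables

print(labels)
print(colnames(vars))
```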

learnind

An index vector specifying the observations that belong to the learning set. May be missing; in that case, the learning set consists of all observations and predictions are made on the learning set.

delta

The shrinkage intensity for the class centroids, a hyperparameter that must be tuned. The default of 0.5 does not necessarily produce good results.
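The effect of the shrinkage intensity can be sketched with the soft-thresholding rule from the shrunken centroids literature: each standardized difference between a class centroid and the overall centroid is moved toward zero by delta, and set to zero once its absolute value falls below delta. The function name soft_threshold and the values in d are hypothetical, and the sketch assumes that delta enters scdaCMA in this standard way.

```r
# Soft-thresholding: shrink each value toward zero by delta,
# setting it to exactly zero if |d| <= delta
soft_threshold <- function(d, delta) {
  sign(d) * pmax(abs(d) - delta, 0)
}

# Hypothetical standardized centroid differences for five genes in one class
d <- c(-1.8, -0.3, 0.2, 0.7, 2.5)

# Number of genes that still contribute for several shrinkage intensities
deltas <- c(0, 0.5, 1)
n_active <- sapply(deltas, function(delta) sum(soft_threshold(d, delta) != 0))

print(soft_threshold(d, 0.5))
print(data.frame(delta = deltas, n_active = n_active))
```

Larger values of delta remove more variables from the classifier, which is why the value is usually chosen by cross-validation rather than left at the default.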

models

A logical value indicating whether the model object should be returned.

...

Currently unused argument.

Value

An object of class cloutput.

References

Tibshirani, R., Hastie, T., Narasimhan, B., and Chu, G. (2003). Class prediction by nearest shrunken centroids, with applications to DNA microarrays. Statistical Science, 18, 104-117.

Examples


### load the CMA package and the Khan data
library(CMA)
data(khan)
### extract class labels
khanY <- khan[,1]
### extract gene expression
khanX <- as.matrix(khan[,-1])
### select learning set
set.seed(111)
learnind <- sample(length(khanY), size=floor(2/3*length(khanY)))
### run Shrunken Centroids classifier, without tuning
scdaresult <- scdaCMA(X=khanX, y=khanY, learnind=learnind)
### show results
show(scdaresult)
ftable(scdaresult)
plot(scdaresult)
