splsda: Fit SPLSDA classification models

Description

Fit a sparse partial least squares discriminant analysis (SPLSDA) classification model.

Usage

splsda( x, y, K, eta, kappa=0.5,
    classifier=c('lda','logistic'), scale.x=TRUE, ... )

Value

An object of class splsda is returned. The print, predict, and coef methods operate on this object.

Arguments

x: Matrix of predictors.
y: Vector of class indices.
K: Number of hidden components.
eta: Thresholding parameter. eta should be between 0 and 1.
kappa: Parameter to control the effect of the concavity of the objective function and the closeness of original and surrogate direction vectors. kappa is relevant only for multicategory classification. kappa should be between 0 and 0.5. Default is 0.5.
classifier: Classifier used in the second step of SPLSDA. Options are "lda" and "logistic". Default is "lda".
scale.x: Scale predictors by dividing each predictor variable by its sample standard deviation? Default is TRUE.
...: Other parameters to be passed through to spls.
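
To give a feel for what the thresholding parameter eta does, the following base R sketch applies soft thresholding of the kind described in Chun and Keles (2010) to a made-up direction vector. The helper soft_threshold and the vector w are hypothetical names introduced here for illustration; this is not the package's internal code, only an approximation of the idea under that assumption.

```r
# Hypothetical helper (not part of spls): soft-threshold a direction vector,
# with the cutoff set to a fraction eta of its largest absolute entry.
soft_threshold <- function(w, eta) {
  cutoff <- eta * max(abs(w))
  sign(w) * pmax(abs(w) - cutoff, 0)
}

# Made-up direction vector for illustration
w <- c(0.9, -0.5, 0.2, -0.05)

soft_threshold(w, eta = 0.2)  # small eta: most entries stay nonzero
soft_threshold(w, eta = 0.8)  # large eta: only the dominant entry stays nonzero
```

Under this sketch, values of eta closer to 1 zero out more entries, which corresponds to selecting fewer predictors.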

Author

Dongjun Chung and Sunduz Keles.

Details

The SPLSDA method is described in detail in Chung and Keles (2010). SPLSDA provides a two-stage approach to PLS-based classification with variable selection: it imposes sparsity directly on the dimension reduction step of PLS, using the sparse partial least squares (SPLS) method proposed in Chun and Keles (2010).

y is assumed to take the numerical values 0, 1, ..., G, where G is the number of classes minus one.

The classifier option selects the classifier used in the second step of SPLSDA; splsda relies on algorithms from the MASS and nnet packages for this purpose. If classifier="logistic", either logistic regression (two classes) or multinomial regression (more than two classes) is used. If classifier="lda", linear discriminant analysis (LDA) is used.

splsda also relies on algorithms from the pls package to fit spls. Install the pls, MASS, and nnet packages before using the splsda functions.
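
Because y must be coded as 0, 1, ..., G, class labels stored as text or as a factor need recoding first. A minimal base R sketch, assuming a hypothetical label vector named labels:

```r
# Hypothetical raw labels; any character or factor vector would work the same way
labels <- c("tumor", "normal", "tumor", "normal", "normal")

classes <- factor(labels)
# Factor codes start at 1, so subtract 1 to obtain 0, 1, ..., G
y <- as.integer(classes) - 1L

levels(classes)  # shows which label maps to which index (sorted order by default)
table(y)         # class counts under the new coding
```

Keeping the factor around (here, classes) makes it possible to map predicted indices back to the original labels later.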

References

Chung D and Keles S (2010), "Sparse partial least squares classification for high dimensional data", Statistical Applications in Genetics and Molecular Biology, Vol. 9, Article 17.

Chun H and Keles S (2010), "Sparse partial least squares for simultaneous dimension reduction and variable selection", Journal of the Royal Statistical Society, Series B, Vol. 72, pp. 3-25.

Examples

# Load the package and the example data set
library(spls)
data(prostate)
# SPLSDA with eta=0.8 & 3 hidden components
f <- splsda( prostate$x, prostate$y, K=3, eta=0.8, scale.x=FALSE )
print(f)
# Extract coefficients and print only the nonzero ones
coef.f <- coef(f)
coef.f[ coef.f!=0, ]
