msda
Performs K-fold cross validation for msda
and returns the best tuning parameter \(\lambda\) among the user-specified or automatically generated candidate values.
cv.msda(x, y, model = NULL, nfolds = 5, lambda = NULL,
lambda.opt = "min", ...)
Input matrix of predictors. x
is of dimension \(N \times p\); each row is an observation vector.
Class label. For K
class problems, y
takes values in \(\{1,\cdots,K\}\).
Method type. The model
argument can be one of 'binary'
, 'multi.original'
, or 'multi.modified'
. The default is NULL. The function fits either a DSDA or an MSDA model, depending on the method type. If no method type is specified, the function chooses one automatically. If the response variable is binary, it fits a DSDA model. If the response variable is multi-class, it fits an original MSDA model for dimension \(p \le 2000\) and a modified MSDA model for dimension \(p > 2000\).
Number of folds. Default value is 5. Although nfolds
can be as large as the sample size (leave-one-out CV), this is not
recommended for large datasets. The smallest allowable value is nfolds=3
for multi.original
and multi.modified
.
User-specified lambda
sequence for cross validation. If not specified, the algorithm generates a sequence of lambda
values from all the data and cross-validates over that sequence.
The criterion for choosing the optimal value when multiple elements in lambda
return the same minimum classification error. "min"
returns the smallest lambda
with minimum cross validation error. "max"
returns the largest lambda
with minimum cross validation error.
Other arguments that can be passed to msda
.
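The tie-breaking behaviour described for lambda.opt can be illustrated with a small base R sketch. The helper pick_lambda and the error values below are hypothetical and made up for illustration; they are not part of the msda package and do not reproduce its internal code.

```r
# Hypothetical helper (not part of msda): choose among lambdas that tie
# for the minimum cross validation error.
pick_lambda <- function(lambda, cvm, lambda.opt = c("min", "max")) {
  lambda.opt <- match.arg(lambda.opt)
  tied <- lambda[cvm == min(cvm)]   # every lambda attaining the minimum error
  if (lambda.opt == "min") min(tied) else max(tied)
}

lambda <- c(0.50, 0.30, 0.20, 0.10, 0.05)
cvm    <- c(0.40, 0.10, 0.10, 0.10, 0.15)   # made-up errors with a three-way tie

pick_lambda(lambda, cvm, "min")   # 0.1
pick_lambda(lambda, cvm, "max")   # 0.3
```

A larger lambda generally gives a sparser model, so "max" favours sparsity among equally accurate fits.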
An object of class cv.dsda
or cv.msda.original
or cv.msda.modified
is returned, which is a
list with the ingredients of the cross-validation fit.
The actual lambda
sequence used. The user-specified or automatically generated sequence may be truncated by the constraints on dfmax
and pmax
.
The mean of cross validation errors for each lambda
.
The standard error of cross validation errors for each lambda
.
The lambda
with minimum cross validation error. If lambda.opt
is "min"
, this is the smallest lambda
with minimum cross validation error. If lambda.opt
is "max"
, this is the largest lambda
with minimum cross validation error.
The largest value of lambda
such that the error is
within one standard error of the minimum. This component is only available for objects of class cv.msda.original
and cv.msda.modified
.
A fitted cv.dsda
or cv.msda.original
or cv.msda.modified
object for the full data.
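As a rough illustration of the "within one standard error of the minimum" component, the sketch below applies the usual one-standard-error rule to made-up error values. The helper one_se_lambda is hypothetical, and taking the standard error at the minimizing lambda is an assumption borrowed from common practice, not something confirmed by this page.

```r
# Hypothetical helper (not part of msda): one-standard-error rule.
# Assumption: the band is the minimum mean error plus the standard
# error measured at that same lambda.
one_se_lambda <- function(lambda, cvm, cvsd) {
  i.min     <- which.min(cvm)             # index of the minimum mean CV error
  threshold <- cvm[i.min] + cvsd[i.min]   # upper edge of the one-SE band
  max(lambda[cvm <= threshold])           # largest lambda inside the band
}

lambda <- c(0.50, 0.30, 0.20, 0.10, 0.05)
cvm    <- c(0.40, 0.14, 0.12, 0.10, 0.15)   # made-up mean CV errors
cvsd   <- c(0.05, 0.03, 0.03, 0.03, 0.04)   # made-up standard errors

one_se_lambda(lambda, cvm, cvsd)   # 0.2
```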
The function cv.msda
runs the function msda
nfolds+1
times. The first run fits the model on all data. If lambda
is specified, it checks whether every value of lambda
satisfies the constraints of dfmax
and pmax
in msda
. If not, a lambda
sequence is generated according to lambda.factor
in msda
. Each of the remaining nfolds
runs fits the model on nfolds-1
folds of the data and predicts on the omitted fold. The function returns the lambda
with minimum average cross validation error and the largest lambda
within one standard error of the minimum.
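The fold loop described above can be sketched in base R. To keep the example runnable without the package, it uses a toy stand-in classifier (nearest centroid with soft-thresholded class means) in place of msda, and refits separately for each lambda. All function names here (toy_fit, toy_predict, toy_cv) are hypothetical, and the fold assignment scheme is an assumption.

```r
# Toy stand-in classifier (NOT msda): nearest centroid, with each class-mean
# offset from the overall mean soft-thresholded by lambda.
toy_fit <- function(x, y, lambda) {
  overall <- colMeans(x)
  classes <- sort(unique(y))
  cents <- do.call(rbind, lapply(classes, function(k) {
    d <- colMeans(x[y == k, , drop = FALSE]) - overall
    overall + sign(d) * pmax(abs(d) - lambda, 0)
  }))
  list(classes = classes, centroids = cents)
}

toy_predict <- function(fit, newx) {
  # squared distance from every row of newx to every centroid
  d <- apply(fit$centroids, 1, function(ctr) rowSums(sweep(newx, 2, ctr)^2))
  d <- matrix(d, nrow = nrow(newx))
  fit$classes[max.col(-d, ties.method = "first")]
}

toy_cv <- function(x, y, lambda, nfolds = 5) {
  n <- nrow(x)
  foldid <- sample(rep(seq_len(nfolds), length.out = n))  # random fold labels
  err <- matrix(NA_real_, nfolds, length(lambda))
  for (k in seq_len(nfolds)) {
    test <- foldid == k
    for (j in seq_along(lambda)) {
      fit  <- toy_fit(x[!test, , drop = FALSE], y[!test], lambda[j])
      pred <- toy_predict(fit, x[test, , drop = FALSE])
      err[k, j] <- mean(pred != y[test])   # misclassification rate on fold k
    }
  }
  cvm  <- colMeans(err)                       # mean error per lambda
  cvsd <- apply(err, 2, sd) / sqrt(nfolds)    # standard error per lambda
  list(lambda = lambda, cvm = cvm, cvsd = cvsd,
       lambda.min = min(lambda[cvm == min(cvm)]))
}

set.seed(1)
n <- 60
y <- rep(1:3, each = 20)
x <- matrix(rnorm(n * 4), n, 4)
x[, 1] <- x[, 1] + 2 * y   # only the first predictor separates the classes
cv <- toy_cv(x, y, lambda = c(2, 1, 0.5, 0.1))
cv$lambda.min
```

The returned list mimics the lambda, cvm, cvsd and lambda.min components described above, with ties resolved as for lambda.opt = "min".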
As in msda
, the user can specify which method to use through the argument model
. Without specification, the function automatically decides the method from the number of classes and variables.
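The automatic choice can be written out as a short decision rule. The helper choose_model below is hypothetical; it only mirrors the rule documented for the model argument (binary response, or multi-class with the threshold at 2000 variables) and is not the package's code.

```r
# Hypothetical helper (not part of msda) mirroring the documented rule.
choose_model <- function(y, p, model = NULL) {
  if (!is.null(model)) {
    # an explicit choice always wins
    return(match.arg(model, c("binary", "multi.original", "multi.modified")))
  }
  K <- length(unique(y))   # number of classes in the response
  if (K == 2) {
    "binary"               # DSDA
  } else if (p <= 2000) {
    "multi.original"       # original MSDA
  } else {
    "multi.modified"       # modified MSDA
  }
}

choose_model(y = c(1, 2, 1, 2), p = 5000)   # "binary"
choose_model(y = 1:3, p = 2000)             # "multi.original"
choose_model(y = 1:3, p = 2001)             # "multi.modified"
```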
Mai, Q., Zou, H. and Yuan, M. (2012), "A direct approach to sparse discriminant analysis in ultra-high dimensions." Biometrika, 99, 29-42.
Mai, Q., Yang, Y., and Zou, H. (2017), "Multiclass sparse discriminant analysis." Statistica Sinica, in press.
# NOT RUN {
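library(msda)
data(GDS1615)
x <- GDS1615$x
y <- GDS1615$y
# 5-fold CV; among lambdas tied for the minimum error, keep the largest
obj.cv <- cv.msda(x=x, y=y, nfolds=5, lambda.opt="max")
lambda.min <- obj.cv$lambda.min
# refit on the full data at the selected lambda, then predict the training labels
obj <- msda(x=x, y=y, lambda=lambda.min)
pred <- predict(obj,x)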
# }