rda.cv: RDA Cross Validation Function

Description

Performs cross-validation for regularized discriminant analysis (RDA) on the training data set.

Usage

rda.cv(fit, x, y, prior, alpha, delta, nfold=min(table(y), 10),
       folds=balanced.folds(y), trace=FALSE)

Arguments

fit

An rda fit object obtained from the rda function.

x

The training data set as used in the rda function.

y

The class labels of the training samples (columns) in "x" as used in the rda function.

prior

A numerical vector that gives the prior proportion of each class. Its length should be equal to the number of classes. By default, the function uses the prior stored in the fit object; supply this argument only to use a different prior vector.

alpha

A numerical vector of the regularization values for alpha. By default, the function uses the alpha values stored in the fit object; supply this argument only to cross-validate over different values of alpha.

delta

A numerical vector of the threshold values for delta. By default, the function uses the delta values stored in the fit object; supply this argument only to cross-validate over different values of delta.

nfold

An integer that specifies the number of folds in the cross-validation analysis. This option is overridden when the folds option is also specified.

folds

A list that provides the folds used in the cross-validation analysis. Each component of the list is an integer vector of the sample indices. See examples below for more details.

trace

A logical flag indicating whether the intermediate steps should be printed.
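
The nfold default and the expected shape of folds can be illustrated with base R alone. The sketch below uses a made-up label vector (y_toy, hypothetical) and builds a simple stratified fold list by hand. It is not the package's balanced.folds implementation, only an illustration of the "list of integer index vectors" format described above.

```r
# Toy class labels: 12 samples in class 1 and 8 in class 2 (made-up data).
y_toy <- c(rep(1, 12), rep(2, 8))

# The default nfold is min(table(y), 10): the smallest class size, capped at 10.
nfold_toy <- min(table(y_toy), 10)

# Hand-built folds: shuffle the indices within each class, then deal them out
# round-robin so that the classes are spread across the folds.
set.seed(1)
shuffled <- unlist(
  lapply(split(seq_along(y_toy), y_toy),
         function(idx) idx[sample.int(length(idx))]),
  use.names = FALSE
)
fold_id <- rep_len(seq_len(nfold_toy), length(shuffled))
folds_toy <- unname(split(shuffled, fold_id))

# Each list component is an integer vector of sample indices.
str(folds_toy)
```

A list built this way has the same structure as the folds argument: one component per fold, and every sample index appearing in exactly one component.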

Value

The rda.cv function will return an object of class rdacv with the following list of components:

alpha

The vector of the regularization values for alpha used in the cross-validation.

delta

The vector of the threshold values for delta used in the cross-validation.

prior

The vector of the prior proportion of each class used in the cross-validation.

nfold

The number of folds used in the cross-validation.

folds

The folds used in the cross-validation.

yhat.new

The 3-dim array of the predicted class labels of the training samples for each combination (alpha, delta). The first index corresponds to the alpha values while the second index corresponds to the delta values. The third index is the predicted class labels for the corresponding samples.

err

The training error matrix from cross-validation. The rows correspond to the alpha values while the columns correspond to the delta values. It is automatically generated by the function.

cv.err

The test error (or cross-validation error) matrix. The rows correspond to the alpha values while the columns correspond to the delta values.

ngene

The matrix of the number of shrunken genes. The rows correspond to the alpha values while the columns correspond to the delta values. Note: the number of shrunken genes is based on the average result from cross-validation.

reg

The type of regularization used in cross-validation.

n

The sample size of the training data set.
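
Because cv.err and ngene share the same layout (rows for alpha, columns for delta), one common way to read the result is to find the cell with the smallest cross-validation error and break ties by the smaller gene count. The sketch below does this on a mock list (cv_toy, with made-up numbers) that only mimics the component names listed above; it is an assumption about how one might use the output, not something rda.cv does for you.

```r
# Mock stand-in for an rdacv object; all values are made up for illustration.
alpha <- c(0, 0.5, 0.99)
delta <- c(0, 1, 2, 3)
cv_toy <- list(
  alpha  = alpha,
  delta  = delta,
  # matrix() fills column by column, so each line below is one delta column.
  cv.err = matrix(c( 9, 7, 8,
                     8, 5, 7,
                     8, 5, 6,
                    10, 6, 9),
                  nrow = length(alpha), ncol = length(delta)),
  ngene  = matrix(c(2000, 2000, 2000,
                     900,  700,  800,
                     300,  150,  400,
                      40,   20,   60),
                  nrow = length(alpha), ncol = length(delta))
)

# All (row, col) positions that attain the minimum cross-validation error.
best <- which(cv_toy$cv.err == min(cv_toy$cv.err), arr.ind = TRUE)

# Among ties, keep the position with the fewest remaining genes.
best <- best[order(cv_toy$ngene[best]), , drop = FALSE][1, ]

best_alpha <- cv_toy$alpha[best[["row"]]]
best_delta <- cv_toy$delta[best[["col"]]]
cat("alpha =", best_alpha, " delta =", best_delta, "\n")
```

With a real result, the same indexing would apply to fit.cv$cv.err and fit.cv$ngene, assuming the components are laid out as described above.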

Details

rda.cv does the RDA-based cross-validation on the training data set.

References

Guo, Y. et al. (2004) Regularized Discriminant Analysis and Its Application in Microarrays, Technical Report, Department of Statistics, Stanford University.

Examples

data(colon)
colon.x <- t(colon.x)
fit <- rda(colon.x, colon.y)
fit.cv <- rda.cv(fit, x=colon.x, y=colon.y)

## To use customized folds in cross-validation,
## for example, 6 folds with 11, 11, 10, 10, 10, 10 samples
## in the respective folds, you can do the following:
index <- sample(1:62, 62)
folds <- list()
folds[[1]] <- index[1:11]
folds[[2]] <- index[12:22]
folds[[3]] <- index[23:32]
folds[[4]] <- index[33:42]
folds[[5]] <- index[43:52]
folds[[6]] <- index[53:62]
fit.cv <- rda.cv(fit, colon.x, colon.y, folds=folds)
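
Before passing hand-made folds to rda.cv, it can help to confirm that they cover every sample exactly once. The helper below (check_folds, a hypothetical name that is not part of the rda package) is a minimal base-R sketch of such a check; it also shows a more compact way to build the same 11, 11, 10, 10, 10, 10 split.

```r
# Hypothetical helper (not part of the rda package): returns TRUE when the
# folds together contain each of the indices 1..n exactly once.
check_folds <- function(folds, n) {
  idx <- unlist(folds, use.names = FALSE)
  length(idx) == n && anyDuplicated(idx) == 0 && all(idx %in% seq_len(n))
}

# Same fold sizes as in the example above, built in one step.
set.seed(42)
index <- sample(1:62, 62)
folds <- unname(split(index, rep(1:6, times = c(11, 11, 10, 10, 10, 10))))

lengths(folds)          # fold sizes
check_folds(folds, 62)  # TRUE when the folds partition samples 1..62
```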
