PCA.CA.KNN.CV: Cross-Validation with PCA-CA-kNN.

Description

This function performs a 10-fold cross-validation on a given data set using a k-nearest neighbors (kNN) classifier. The kNN classifier is applied to the scores of the Principal Component Analysis (PCA) and Canonical Analysis (CA). The output is a vector of predicted labels.

Usage

PCA.CA.KNN.CV(x,cl,constrain,kn=10,variance=0.9)

Arguments

x

a matrix.

cl

a classification vector.

constrain

a vector of nrow(x) elements. Samples with the same constrain value are kept together: they are all assigned either to the training set or to the test set of the cross-validation.

kn

the number of nearest neighbors to consider (by default = 10).

variance

the number of principal components retained from the PCA is chosen according to the fraction of variance to be explained (by default = 0.9).
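As a rough illustration of how a variance threshold can translate into a number of principal components, the sketch below uses base R's prcomp on the built-in iris data. It is not taken from the source of PCA.CA.KNN.CV: the helper name n_components_for_variance is hypothetical, and the selection rule used internally by the function may differ.

```r
# Hypothetical helper (not part of the package): smallest number of
# principal components whose cumulative explained variance reaches
# the requested fraction.
n_components_for_variance <- function(x, variance = 0.9) {
  pca <- prcomp(x, center = TRUE, scale. = FALSE)
  explained <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
  which(explained >= variance)[1]
}

x <- as.matrix(iris[, 1:4])
n_pc <- n_components_for_variance(x, variance = 0.9)

# Keep only the scores of the selected components.
scores <- prcomp(x)$x[, seq_len(n_pc), drop = FALSE]
```

Raising the variance argument keeps more components, so the later steps work on a larger score matrix.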

Value

The function returns a vector of predicted labels.

Details

The PCA-CA-kNN classifier was used successfully in Wallner-Liebmann et al. (2015) and Saccenti et al. (2014) to classify metabolomic data.
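The role of the constrain argument can be pictured with a small fold-assignment sketch. The code below is only an assumption about how grouped folds might be built, not the package's implementation; the helper grouped_folds and the toy constrain vector are hypothetical.

```r
# Hypothetical helper (not part of the package): assign each sample to one
# of `folds` folds so that all samples sharing a constrain value end up in
# the same fold.
grouped_folds <- function(constrain, folds = 10) {
  groups <- unique(constrain)
  # Deal fold labels out to the groups, then shuffle the assignment.
  group_fold <- sample(rep_len(seq_len(folds), length(groups)))
  group_fold[match(constrain, groups)]
}

set.seed(1)
constrain <- rep(1:20, each = 3)  # toy data: 20 subjects, 3 samples each
fold <- grouped_folds(constrain)

# TRUE when every subject's samples fall in exactly one fold.
all(tapply(fold, constrain, function(f) length(unique(f)) == 1))
```

When every sample has its own constrain value, as with 1:length(cl), no samples are tied together and the split reduces to an ordinary cross-validation.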

References

Wallner-Liebmann S, Gralka E, Tenori L, et al. The impact of free or standardized lifestyle and urine sampling protocol on metabolome recognition accuracy. Genes Nutr 2015;10:441.

Saccenti E, Tenori L, Verbruggen P, et al. Of monkeys and men: a metabolomic analysis of static and dynamic urinary metabolic phenotypes in two species. PLoS One 2014;9(9):e106077.

Examples

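library(KODAMA)  # assumption: PCA.CA.KNN.CV, scaling and MetRef come from the KODAMA package
data(MetRef)
u <- MetRef$data
# Drop all-zero columns; logical indexing is safe even when no column is all zero
u <- u[, colSums(u) != 0]
u <- scaling(u)$newXtrain
class <- as.factor(unlist(MetRef$donor))
# Each sample gets its own constrain value, so no samples are grouped together
results <- PCA.CA.KNN.CV(u, class, 1:length(class))
levels(results) <- levels(class)
# Cross-tabulate predicted labels against the true donor labels
table(results, class)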
