core_cpp: Maximization of Cross-Validated Accuracy Methods

Description

This function maximizes the cross-validated accuracy of a classification vector through an iterative process.
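
To make the idea of an iterative maximization concrete, here is a toy sketch in base R. It is not the KODAMA implementation: it assumes a leave-one-out 1-nearest-neighbour classifier instead of KNN or PLS-DA, and a simple acceptance rule (keep a relabelling only if the accuracy improves). The helper names `loo_predict` and `maximize_accuracy` are hypothetical.

```r
# Toy sketch only (NOT the package's algorithm): iterative maximization of
# leave-one-out cross-validated accuracy with a 1-nearest-neighbour classifier.

# Each sample is predicted with the label of its nearest *other* sample.
loo_predict <- function(x, labels) {
  d <- as.matrix(dist(x))
  diag(d) <- Inf                         # a sample may not vote for itself
  labels[apply(d, 1, which.min)]
}

maximize_accuracy <- function(x, labels, cycles = 20) {
  best      <- labels
  pred      <- loo_predict(x, best)
  acc_best  <- mean(pred == best)        # accuracy of the starting labels
  acc_trace <- acc_best
  for (i in seq_len(cycles)) {
    wrong <- which(pred != best)
    if (length(wrong) == 0) break        # nothing left to improve
    # Relabel a random subset of misclassified samples with their prediction
    k    <- sample.int(length(wrong), 1)
    pick <- wrong[sample.int(length(wrong), k)]
    candidate       <- best
    candidate[pick] <- pred[pick]
    cand_pred <- loo_predict(x, candidate)
    cand_acc  <- mean(cand_pred == candidate)
    acc_trace <- c(acc_trace, cand_acc)
    if (cand_acc > acc_best) {           # accept only strict improvements
      best     <- candidate
      pred     <- cand_pred
      acc_best <- cand_acc
    }
  }
  list(clbest = best, accbest = acc_best, vect_acc = acc_trace)
}

set.seed(1)
x     <- as.matrix(iris[, -5])
start <- sample(1:150, 150, TRUE)        # random starting classification
res   <- maximize_accuracy(x, start)
```

The returned list mirrors the names used in the Value section only for readability; the real function may compute and store these quantities differently.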

Usage

core_cpp(x, 
         xTdata=NULL,
         clbest, 
         Tcycle=20, 
         FUN=c("PLS-DA","KNN"), 
         fpar=2, 
         constrain=NULL, 
         fix=NULL, 
         shake=FALSE)

Value

The function returns a list with the following items (the last one only when xTdata is supplied):

clbest: a classification vector with a maximized cross-validated accuracy.
accbest: the maximum cross-validated accuracy achieved.
vect_acc: a vector of all cross-validated accuracies obtained.
vect_proj: a prediction of samples in xTdata matrix using the vector clbest. This output is present only if xTdata is not NULL.

Arguments

x: a matrix.
xTdata: a matrix for projections. This matrix contains samples that are not used for the maximization of the cross-validated accuracy. Their classification is obtained by predicting samples on the basis of the final classification vector.
clbest: a vector to optimize.
Tcycle: number of iterative cycles that leads to the maximization of cross-validated accuracy.
FUN: classifier to be considered. Choices are "KNN" and "PLS-DA".
fpar: parameters of the classifier. If the classifier is KNN, fpar represents the number of neighbours. If the classifier is PLS-DA, fpar represents the number of components.
constrain: a vector of nrow(x) elements. Supervised constraints can be imposed by linking samples so that, if one of them changes class during the maximization of the cross-validated accuracy, all linked samples change in the same way (i.e., they are forced to belong to the same class). Samples with the same constrain value are forced to stay together.
fix: a vector of nrow(x) elements. The values of this vector must be TRUE or FALSE. By default all elements are FALSE. Samples whose fix value is TRUE keep the class label defined in clbest throughout the maximization of the cross-validated accuracy. For more information refer to Cacciatore, et al. (2014).
shake: if shake = FALSE, the cross-validated accuracy is computed with the classes defined in clbest before the maximization of the cross-validated accuracy begins.
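
As a small illustration of the constrain and fix arguments, the following base R sketch builds both vectors from sample metadata. The metadata (`subject`, `known`) is invented for the example; the only assumptions taken from the descriptions above are that each vector has one element per row of x, that equal constrain values link samples, and that fix is logical.

```r
# Hypothetical metadata: 6 samples from 3 subjects, two replicates each.
subject <- c("A", "A", "B", "B", "C", "C")
# Hypothetical flag: class labels of the first two samples are trusted a priori.
known   <- c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)

# constrain: samples sharing a value are linked and must stay in the same class.
constrain <- as.integer(factor(subject))

# fix: TRUE means the sample keeps its starting class label.
fix <- known

print(constrain)
print(fix)
```

These vectors would then be passed as constrain = constrain and fix = fix alongside a matrix x with six rows.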

Author

Stefano Cacciatore and Leonardo Tenori

References

Cacciatore S, Luchinat C, Tenori L
Knowledge discovery by accuracy maximization.
Proc Natl Acad Sci U S A 2014;111(14):5117-22. doi: 10.1073/pnas.1220873111.

Cacciatore S, Tenori L, Luchinat C, Bennett PR, MacIntyre DA
KODAMA: an updated R package for knowledge discovery and data mining.
Bioinformatics 2017;33(4):621-623. doi: 10.1093/bioinformatics/btw705.

Examples

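library(KODAMA)

# Load the famous (Fisher's or Anderson's) iris data set
data(iris)
u <- as.matrix(iris[, -5])

# Random starting classification: 150 labels drawn with replacement from 1:150
s <- sample(1:150, 150, TRUE)

# Maximize the cross-validated accuracy of the vector s (classifier parameter fpar = 5)
results <- core_cpp(u, clbest = s, fpar = 5)

# Print the optimized classification vector
print(as.numeric(results$clbest))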
