RFmarkerDetector (version 1.0.1)

aucMCV: AUC multiple cross-validation

Description

This function implements the AUCRF algorithm to identify the variables (metabolites) most relevant to the classification task.

Usage

aucMCV(data, seed = 1234, ref_level = levels(data[, 2])[1], auc_rank = "MDG", auc_ntree = 500, auc_nfolds = 5, auc_pdel = 0.2, auc_colour = "grey", auc_iterations = 5)

Arguments

data
an n x p data frame used to run the AUCRF algorithm and to perform a repeated cross-validation of the AUCRF process. The dependent variable must be a binary factor coded as 0 for negatives (e.g. controls) and 1 for positives (e.g. cases)
seed
a numeric value to set the seed of R's random number generator
ref_level
the class assumed as reference for the binary classification
auc_rank
the importance measure provided by randomForest for ranking the variables. There are two options: MDG (mean decrease in Gini, the default) and MDA (mean decrease in accuracy)
auc_ntree
the number of trees in each random forest model
auc_nfolds
the number of folds in cross validation. By default a 5-fold cross validation is performed
auc_pdel
the fraction of variables to be removed at each step. If auc_pdel = 0, only one variable is removed at each step
auc_colour
the color chosen
auc_iterations
a numeric value giving the number of cross-validation repetitions
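
As a hedged illustration of the input layout described for the data argument, the following base R sketch builds a small data frame whose second column is the class factor (the position implied by the default ref_level = levels(data[, 2])[1]) and recodes it as 0/1. The toy data, the column names and the variable names are hypothetical; whether aucMCV performs a similar recoding internally is not stated on this page.

```r
# Hypothetical toy data: column names and values are made up for illustration.
set.seed(1234)
n <- 20
toy <- data.frame(
  sample_id = paste0("S", seq_len(n)),
  class     = factor(rep(c("control", "case"), each = n / 2)),
  met_a     = rnorm(n),
  met_b     = rnorm(n)
)

# The default ref_level is the first level of the second column.
# Factor levels sort alphabetically, so here it is "case", not "control".
default_ref <- levels(toy[, 2])[1]

# Recode so that the chosen reference class is 0 (negative)
# and the other class is 1 (positive).
ref_level <- "control"
toy$class <- factor(ifelse(toy$class == ref_level, 0, 1), levels = c(0, 1))

table(toy$class)
```

Because factor levels are ordered alphabetically by default, the first level is not necessarily the control group, so it can be safer to pass ref_level explicitly.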

Details

Exploiting the AUCRF algorithm, the function identifies the best-performing 'parsimonious' model in terms of OOB-AUC and the most relevant variables (metabolites) involved in the prediction task.
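
Two ingredients of this procedure can be sketched in base R: a backward-elimination schedule governed by a fraction such as auc_pdel, and an AUC computed from scores and 0/1 labels. This is a minimal sketch under stated assumptions, not the package's implementation: the function names elimination_schedule and auc_from_ranks are hypothetical, the rounding rule (floor, with at least one variable dropped per step) is an assumption, and the AUC shown is the standard rank-based (Mann-Whitney) estimate applied to arbitrary scores rather than to out-of-bag predictions.

```r
# Hypothetical helper: sizes of the variable set after each elimination step.
# Assumption: floor(p * pdel) variables are dropped, and at least one per step,
# so pdel = 0 removes exactly one variable at a time.
elimination_schedule <- function(p, pdel = 0.2, min_vars = 1) {
  sizes <- p
  while (p > min_vars) {
    n_drop <- max(1, floor(p * pdel))
    p <- max(min_vars, p - n_drop)
    sizes <- c(sizes, p)
  }
  sizes
}

# Hypothetical helper: rank-based AUC for numeric scores and 0/1 labels.
# Ties in the scores receive average ranks.
auc_from_ranks <- function(scores, labels) {
  pos   <- labels == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  r     <- rank(scores)
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

elimination_schedule(16, pdel = 0.5)  # 16 8 4 2 1
elimination_schedule(5,  pdel = 0)    # 5 4 3 2 1
auc_from_ranks(c(0.1, 0.4, 0.35, 0.8), c(0, 0, 1, 1))  # 0.75
```

A larger fraction shortens the schedule and therefore requires fewer forests to be fitted, at the cost of a coarser search over model sizes.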

References

Calle ML, Urrea V, Boulesteix A-L, Malats N (2011) 'AUC-RF: A new strategy for genomic profiling with Random Forest'. Human Heredity.

Examples

## data(cachexiaData)
## aucMCV(cachexiaData, ref_level = 'control')