The function computes various performance weasures for binary classification.
perfMeasures(pred, pred.group, truth, namePos)numeric vector in $[0,1]$ with predictions (probability to belong to positive group).
vector or factor including the predicted group. If missing,
pred.group is computed from pred, where pred >= 0.5 is
classified as positive.
true grouping vector or factor.
value representing the positive group.
data.frame with names of the performance measures and their respective
values.
The function computes various performance measures. The measures are: accuracy (ACC), probabiliy of correct classification (PCC), probability of missclassification (PMC), error rate, sensitivity, specificity, prevalence, balanced accuracy (BACC), informedness, Youden's J statistic, positive predictive value (PPV), negative predictive value (NPV), markedness, F1 score, Matthews' correlation coefficient (MCC), proportion of positive predictions, expected accuracy, Cohen's kappa coefficient, area under the ROC curve (AUC), Gini index, Brier score, positive Brier score, negative Brier score, and balanced Brier score.
G.W. Brier (1950). Verification of forecasts expressed in terms of probability. Mon. Wea. Rev. 78, 1???3.
K.H. Brodersen, C.S. Ong, K.E. Stephan, J.M. Buhmann (2010). The balanced accuracy and its posterior distribution. In Pattern Recognition (ICPR), 20th International Conference on, 3121-3124 (IEEE, 2010).
J.A. Cohen (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20, 3746.
T. Fawcett (2006). An introduction to ROC analysis. Pattern Recognition Letters 27, 861???874.
T.A. Gerds, T. Cai, M. Schumacher (2008). The performance of risk prediction models. Biom J 50, 457-479.
D. Hand, R. Till (2001). A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning 45, 171-186.
J. Hernandez-Orallo, P.A. Flach, C. Ferri (2011). Brier curves: a new cost- based visualisation of classifier performance. In L. Getoor and T. Scheffer (eds.) Proceedings of the 28th International Conference on Machine Learning (ICML-11), 585???592 (ACM, New York, NY, USA).
J. Hernandez-Orallo, P.A. Flach, C. Ferri (2012). A unified view of performance metrics: Translating threshold choice into expected classification loss. J. Mach. Learn. Res. 13, 2813-2869.
B.W. Matthews (1975). Comparison of the predicted and observed secondary structure of t4 phage lysozyme. Biochimica et Biophysica Acta (BBA) - Protein Structure 405, 442-451.
D.M. Powers (2011). Evaluation: From Precision, Recall and F-Factor to ROC, Informedness, Markedness and Correlation. Journal of Machine Learning Technologies 1, 37-63.
N.A. Smits (2010). A note on Youden's J and its cost ratio. BMC Medical Research Methodology 10, 89.
B. Wallace, I. Dahabreh (2012). Class probability estimates are unreliable for imbalanced data (and how to fix them). In Data Mining (ICDM), IEEE 12th International Conference on, 695-04.
J.W. Youden (1950). Index for rating diagnostic tests. Cancer 3, 32-35.
# NOT RUN {
## example from dataset infert
fit <- glm(case ~ spontaneous+induced, data = infert, family = binomial())
pred <- predict(fit, type = "response")
## with group numbers
perfMeasures(pred, truth = infert$case, namePos = 1)
## with group names
my.case <- factor(infert$case, labels = c("control", "case"))
perfMeasures(pred, truth = my.case, namePos = "case")
# }
Run the code above in your browser using DataLab