skill_confusionMatrix: Confusion Matrix Statistics

Description

Measurements of categorical forecast accuracy have a long history in weather forecasting. The standard approach involves making binary classifications (detected/not-detected) of predicted and observed data and combining them in a binary contingency table known as a confusion matrix.
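
As a rough illustration of the classification step, the base R sketch below thresholds numeric values into detected/not-detected and tabulates the result. The data values and the threshold of 35 are made up for illustration and are not part of this function.

```r
# Hypothetical numeric predictions and observations (illustrative values only)
predictedValues <- c(10, 40, 55, 20, 38, 5)
observedValues  <- c(12, 36, 30, 50, 41, 8)

# Assumed detection threshold: values at or above it count as "detected"
threshold <- 35

predicted <- predictedValues >= threshold
observed  <- observedValues >= threshold

# 2 x 2 contingency table (confusion matrix) of predicted vs. observed;
# explicit factor levels keep all four cells even if one is empty
confusion <- table(
  predicted = factor(predicted, levels = c(TRUE, FALSE)),
  observed  = factor(observed, levels = c(TRUE, FALSE))
)
print(confusion)
```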

This function creates a confusion matrix from predicted and observed values and calculates a wide range of common statistics including:

TP (true positive)
FP (false positive) (type I error)
FN (false negative) (type II error)
TN (true negative)
TPRate (true positive rate) = sensitivity = recall = TP / (TP + FN)
FPRate (false positive rate) = FP / (FP + TN)
FNRate (false negative rate) = FN / (TP + FN)
TNRate (true negative rate) = specificity = TN / (FP + TN)
accuracy = proportionCorrect = (TP + TN) / total
errorRate = 1 - accuracy = (FP + FN) / total
PPV (positive predictive value) = precision = TP / (TP + FP)
falseAlarmRatio = FDR (false discovery rate) = FP / (TP + FP)
NPV (negative predictive value) = TN / (TN + FN)
FOR (false omission rate) = FN / (TN + FN)
f1_score = (2 * TP) / (2 * TP + FP + FN)
detectionRate = TP / total
baseRate = prevalence = (TP + FN) / total
probForecastOccurance = detectionPrevalence = (TP + FP) / total
balancedAccuracy = (TPRate + TNRate) / 2
expectedAccuracy = (((TP + FP) * (TP + FN) / total) + ((FP + TN) * (FN + TN) / total)) / total
heidkeSkill = kappa = (accuracy - expectedAccuracy) / (1 - expectedAccuracy)
bias = (TP + FP) / (TP + FN)
hitRate = TP / (TP + FN)
falseAlarmRate = FP / (FP + TN)
pierceSkill = ((TP * TN) - (FP * FN)) / ((FP + TN) * (TP + FN))
criticalSuccess = TP / (TP + FP + FN)
oddsRatioSkill = yulesQ = ((TP * TN) - (FP * FN)) / ((TP * TN) + (FP * FN))
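
The definitions above can be checked by hand. The sketch below is a minimal base R illustration using made-up data. It is not the function's actual implementation, and the variable names simply mirror the names in the list.

```r
# Made-up logical vectors (illustrative only)
predicted <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE)
observed  <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE)

# The four cells of the confusion matrix
TP <- sum(predicted & observed)     # predicted and observed
FP <- sum(predicted & !observed)    # predicted but not observed
FN <- sum(!predicted & observed)    # observed but not predicted
TN <- sum(!predicted & !observed)   # neither predicted nor observed
total <- TP + FP + FN + TN

# A few of the derived metrics, written exactly as in the list
TPRate <- TP / (TP + FN)
FPRate <- FP / (FP + TN)
accuracy <- (TP + TN) / total
expectedAccuracy <- (((TP + FP) * (TP + FN) / total) +
                     ((FP + TN) * (FN + TN) / total)) / total
heidkeSkill <- (accuracy - expectedAccuracy) / (1 - expectedAccuracy)
pierceSkill <- ((TP * TN) - (FP * FN)) / ((FP + TN) * (TP + FN))
criticalSuccess <- TP / (TP + FP + FN)

print(c(TP = TP, FP = FP, FN = FN, TN = TN))
print(c(accuracy = accuracy, heidkeSkill = heidkeSkill,
        pierceSkill = pierceSkill, criticalSuccess = criticalSuccess))
```

Note that pierceSkill is algebraically the same as TPRate - FPRate, which gives a quick consistency check on the counts.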

Usage

skill_confusionMatrix(predicted, observed, FPCost = 1, FNCost = 1,
  lightweight = FALSE)

Arguments

predicted

logical vector of predicted values

observed

logical vector of observed values

FPCost

cost associated with false positives (type I error)

FNCost

cost associated with false negatives (type II error)

lightweight

logical flag; if TRUE, the returned list omits the derived metrics

Value

List containing a table of confusion matrix values and a suite of derived metrics.

References

Simple Guide to Confusion Matrix Terminology

Examples

# NOT RUN {
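# Simulate 1000 independent predicted and observed detections (about 30% TRUE each)
predicted <- sample(c(TRUE, FALSE), 1000, replace = TRUE, prob = c(0.3, 0.7))
observed <- sample(c(TRUE, FALSE), 1000, replace = TRUE, prob = c(0.3, 0.7))

# Build the confusion matrix and its derived metrics, then display the result
cm <- skill_confusionMatrix(predicted, observed)
print(cm)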

# }