qwraps2 (version 0.4.2)

confusion_matrix: Confusion Matrices (Contingency Tables)

Description

Construction of confusion matrices, accuracy, sensitivity, specificity, and confidence intervals (Wilson's method and, optionally, bootstrapping).

Usage

confusion_matrix(x, ...)

# S3 method for default
confusion_matrix(x, y, positive, boot = FALSE, boot_samples = 1000L, alpha = 0.05, ...)

# S3 method for formula
confusion_matrix(formula, data = parent.frame(), positive, boot = FALSE, boot_samples = 1000L, alpha = 0.05, ...)

is.confusion_matrix(x)

# S3 method for confusion_matrix
print(x, ...)

Arguments

x

prediction condition vector: a two-level factor, or a variable that can be converted to one.

...

not currently used

y

True Condition vector with the same possible values as x.

positive

the level of x and y which is the positive outcome. If missing, the first level of factor(y) is used as the positive level.

boot

boolean, should bootstrapped confidence intervals for the sensitivity and specificity be computed? Defaults to FALSE.

boot_samples

number of bootstrap samples to generate; defaults to 1000L. Ignored if boot == FALSE.

alpha

100(1 - alpha)% confidence intervals for the sensitivity and specificity. Ignored if boot == FALSE.

formula

column (known) ~ row (test) for building the confusion matrix

data

environment containing the variables listed in the formula
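
As a minimal sketch of how positive, boot, boot_samples, and alpha fit together (the 0/1 vectors below are made up for illustration; only the default method is shown):

set.seed(42)
test  <- rbinom(100, 1, 0.5)
truth <- rbinom(100, 1, 0.5)

# Wilson intervals only (the default) versus bootstrapped intervals for
# sensitivity and specificity; alpha = 0.05 gives 95% intervals.
cm_wilson <- confusion_matrix(test, truth, positive = "1")
cm_boot   <- confusion_matrix(test, truth, positive = "1",
                              boot = TRUE, boot_samples = 2000L, alpha = 0.05)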

Value

The sensitivity and specificity functions return numeric values. confusion_matrix returns a list with elements:

  • tab the confusion matrix,

  • stats a matrix of summary statistics and confidence intervals.
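
For instance, the returned list can be inspected like this (a sketch assuming the con_mat object constructed in Example 1 below):

is.confusion_matrix(con_mat)  # TRUE
con_mat$tab                   # the confusion matrix (contingency table)
con_mat$stats                 # summary statistics and confidence intervals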

Details

Sensitivity and Specificity: For the sensitivity and specificity functions we expect the 2-by-2 confusion matrix (contingency table) to be of the form:

                            True Condition
                            +     -
  Predicted Condition +     TP    FP
  Predicted Condition -     FN    TN

where

  • FN: False Negative,

  • FP: False Positive,

  • TN: True Negative, and

  • TP: True Positive.

Recall:

  • sensitivity = TP / (TP + FN)

  • specificity = TN / (TN + FP)

  • positive predictive value (PPV) = TP / (TP + FP)

  • negative predictive value (NPV) = TN / (TN + FN)
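
A quick worked check of these formulas, using the cell counts that the test/truth vectors in Example 1 below yield with positive = "1" (TP = 20, FP = 33, FN = 10, TN = 37):

TP <- 20; FP <- 33; FN <- 10; TN <- 37

TP / (TP + FN)  # sensitivity ~ 0.667
TN / (TN + FP)  # specificity ~ 0.529
TP / (TP + FP)  # PPV         ~ 0.377
TN / (TN + FN)  # NPV         ~ 0.787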

Examples

################################################################################
## Example 1
test  <- c(rep(1, 53), rep(0, 47))
truth <- c(rep(1, 20), rep(0, 33), rep(1, 10), rep(0, 37))
con_mat <- confusion_matrix(test, truth, positive = "1")
str(con_mat)
con_mat

################################################################################
## Example 2: based on an example from the wikipedia page:
# https://en.wikipedia.org/wiki/Confusion_matrix

animals <-
  data.frame(Predicted = c(rep("Cat",    5 + 2 +  0),
                           rep("Dog",    3 + 3 +  2),
                           rep("Rabbit", 0 + 1 + 11)),
             Actual    = c(rep(c("Cat", "Dog", "Rabbit"), times = c(5, 2,  0)),
                           rep(c("Cat", "Dog", "Rabbit"), times = c(3, 3,  2)),
                           rep(c("Cat", "Dog", "Rabbit"), times = c(0, 1, 11))),
             stringsAsFactors = FALSE)

table(animals)

cats <- apply(animals, 1:2, function(x) ifelse(x == "Cat", "Cat", "Non-Cat"))

# Default calls, note the difference based on what is set as the 'positive'
# value.
confusion_matrix(cats[, "Predicted"], cats[, "Actual"], positive = "Cat")
confusion_matrix(cats[, "Predicted"], cats[, "Actual"], positive = "Non-Cat")

# Using a Formula
confusion_matrix(I(Actual == "Cat") ~ I(Predicted == "Cat"),
                 data = as.data.frame(animals),
                 positive = "TRUE")

################################################################################
## Example 3
russell <-
  data.frame(Pred  = c(rep(0, 2295), rep(0, 118), rep(1, 1529), rep(1, 229)),
             Truth = c(rep(0, 2295), rep(1, 118), rep(0, 1529), rep(1, 229)))

# The values for Sensitivity, Specificity, PPV, and NPV are dependent on the
# "positive" level.  By default, the first level of y is used.
confusion_matrix(x = russell$Pred, y = russell$Truth, positive = "0")
confusion_matrix(x = russell$Pred, y = russell$Truth, positive = "1")
