SLmetrics (version 0.3-4)

cmatrix: Confusion Matrix

Description

A generic S3 function to compute the confusion matrix for a classification model. This function dispatches to S3 methods in cmatrix() and performs no input validation. If you supply NA values or vectors of unequal length (e.g. length(x) != length(y)), the underlying C++ code may trigger undefined behavior and crash your R session.

Defensive measures

Because cmatrix() operates on raw pointers, pointer-level faults (e.g. from NA or mismatched length) occur before any R-level error handling. Wrapping calls in try() or tryCatch() will not prevent R-session crashes.

To guard against this, wrap cmatrix() in a "safe" validator that checks for NA values and matching length, for example:

safe_cmatrix <- function(x, y, ...) {
  stopifnot(
    !anyNA(x), !anyNA(y),
    length(x) == length(y)
  )
  cmatrix(x, y, ...)
}

Apply the same pattern to any custom metric functions to ensure input sanity before calling the underlying C++ code.
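
One way to apply the pattern to several metrics without repeating the checks is a small function factory. The sketch below is illustrative only: with_input_checks() and toy_accuracy() are hypothetical names that are not part of SLmetrics, and the toy metric is plain R standing in for a C++-backed function.

```r
## Hypothetical helper (not part of SLmetrics): wrap any
## two-vector metric so that inputs are validated at the
## R level before the metric itself is called
with_input_checks <- function(metric) {
  force(metric)
  function(actual, predicted, ...) {
    stopifnot(
      !anyNA(actual), !anyNA(predicted),
      length(actual) == length(predicted)
    )
    metric(actual, predicted, ...)
  }
}

## Toy stand-in for a C++-backed metric
toy_accuracy <- function(actual, predicted) {
  mean(actual == predicted)
}

safe_accuracy <- with_input_checks(toy_accuracy)

## Valid input is passed through to the metric
safe_accuracy(c(1L, 2L, 2L), c(1L, 2L, 1L))

## Invalid input raises an ordinary R error, which
## try() can handle because the metric is never reached
result <- try(safe_accuracy(c(1L, NA), c(1L, 2L)), silent = TRUE)
inherits(result, "try-error")
```

Because the checks fail at the R level, the error surfaces before any compiled code runs, which is the point of the wrapper.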

The workhorse

cmatrix() is the main function for classification metrics with cmatrix S3 dispatch. These metric functions call cmatrix() internally, so there is a significant gain in computing the confusion matrix once and then passing it on to the metrics. For example:

## Compute confusion matrix
confusion_matrix <- cmatrix(actual, predicted)

## Evaluate accuracy
## via S3 dispatching
accuracy(confusion_matrix)

## Evaluate recall
## via S3 dispatching
recall(confusion_matrix)
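
To make the reuse concrete, the following sketch shows how two metrics can be read off a single confusion matrix. It uses base R's table() as a stand-in for cmatrix() and is not the SLmetrics implementation; the small input vectors are made up for illustration.

```r
## Base R stand-in for a confusion matrix
## (rows: actual, columns: predicted)
actual    <- factor(c("A", "A", "B", "B", "B"), levels = c("A", "B"))
predicted <- factor(c("A", "B", "B", "B", "A"), levels = c("A", "B"))

confusion_matrix <- table(actual, predicted)

## Accuracy: share of observations on the diagonal
accuracy_value <- sum(diag(confusion_matrix)) / sum(confusion_matrix)

## Recall per class: diagonal counts divided by
## the number of actual observations in each class
recall_values <- diag(confusion_matrix) / rowSums(confusion_matrix)

accuracy_value
recall_values
```

Both quantities come from the same matrix, so the tabulation over all observations only has to happen once.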

Usage

## Generic S3 method
## for Confusion Matrix
cmatrix(...)

## Generic S3 method
## for weighted Confusion Matrix
weighted.cmatrix(...)

Value

A named \(k\) x \(k\) <matrix>

Arguments

...

Arguments passed on to cmatrix.factor, weighted.cmatrix.factor

actual,predicted

A pair of <integer> or <factor> vectors of length \(n\) with \(k\) levels.

w

A <double> vector of sample weights.
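
As a rough illustration of what sample weights do, the sketch below builds a weighted confusion matrix in base R, where each cell holds the sum of the weights of the observations falling into it rather than their count. It uses xtabs() as a stand-in and makes no claim about how weighted.cmatrix() is implemented; the data are made up.

```r
## Made-up data: four observations with sample weights
actual    <- factor(c("A", "A", "B", "B"), levels = c("A", "B"))
predicted <- factor(c("A", "B", "B", "B"), levels = c("A", "B"))
w         <- c(0.5, 1.5, 2.0, 1.0)

## Each cell sums the weights of the matching observations
## (rows: actual, columns: predicted)
weighted_matrix <- xtabs(w ~ actual + predicted)

weighted_matrix

## The cells add up to the total weight
sum(weighted_matrix) == sum(w)
```

With all weights equal to 1, the same call reduces to an ordinary count-based confusion matrix.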

Dimensions

There is no robust defensive measure against misspecifying the confusion matrix. If the arguments are passed correctly, the resulting confusion matrix has the form:

             A (Predicted)   B (Predicted)
A (Actual)   Value           Value
B (Actual)   Value           Value
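
The sketch below illustrates why a misspecified matrix is hard to detect. It uses base R's table() as a stand-in for cmatrix() with made-up data: swapping the two arguments still yields a valid \(k\) x \(k\) matrix, only transposed.

```r
## Made-up data where the off-diagonal cells differ
actual    <- factor(c("A", "A", "A", "B"), levels = c("A", "B"))
predicted <- factor(c("A", "B", "B", "B"), levels = c("A", "B"))

## Rows: actual, columns: predicted
correct <- table(actual, predicted)

## Arguments in the wrong order
swapped <- table(predicted, actual)

## Both are well-formed 2 x 2 matrices, so nothing flags the
## mistake, yet one is the transpose of the other
all(unclass(swapped) == t(unclass(correct)))
```

Row-based metrics such as recall and column-based metrics such as precision trade places when the matrix is transposed, so passing the arguments by name is a cheap safeguard.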

See Also

Other Classification: accuracy(), auc.pr.curve(), auc.roc.curve(), baccuracy(), brier.score(), ckappa(), cross.entropy(), dor(), fbeta(), fdr(), fer(), fmi(), fpr(), hammingloss(), jaccard(), logloss(), mcc(), nlr(), npv(), plr(), pr.curve(), precision(), recall(), relative.entropy(), roc.curve(), shannon.entropy(), specificity(), zerooneloss()

Other Supervised Learning: accuracy(), auc.pr.curve(), auc.roc.curve(), baccuracy(), brier.score(), ccc(), ckappa(), cross.entropy(), deviance.gamma(), deviance.poisson(), deviance.tweedie(), dor(), fbeta(), fdr(), fer(), fmi(), fpr(), gmse(), hammingloss(), huberloss(), jaccard(), logloss(), maape(), mae(), mape(), mcc(), mpe(), mse(), nlr(), npv(), pinball(), plr(), pr.curve(), precision(), rae(), recall(), relative.entropy(), rmse(), rmsle(), roc.curve(), rrmse(), rrse(), rsq(), shannon.entropy(), smape(), specificity(), zerooneloss()

Examples

## Classes and
## seed
set.seed(1903)
classes <- c("Kebab", "Falafel")

## Generate actual
## and predicted classes
actual_classes <- factor(
    x = sample(x = classes, size = 1e3, replace = TRUE),
    levels = c("Kebab", "Falafel")
)

predicted_classes <- factor(
    x = sample(x = classes, size = 1e3, replace = TRUE),
    levels = c("Kebab", "Falafel")
)

## Compute the confusion
## matrix
SLmetrics::cmatrix(
    actual    = actual_classes,
    predicted = predicted_classes
)
