summary.conf_mat: Summary Statistics for Confusion Matrices

Description

Various statistical summaries of confusion matrices are produced and returned in an easily used format. These potentially include those shown in the help pages for sens(), recall(), and accuracy().

Usage

# S3 method for conf_mat
summary(object, prevalence = NULL, beta = 1,
  wide = FALSE, ...)

Arguments

object

An object of class conf_mat, as produced by conf_mat().

prevalence

A number in (0, 1) for the prevalence (i.e. prior) of the event. If left to the default, the data are used to derive this value.

beta

A numeric value used to weight precision and recall for f_meas().

wide

A single logical value: should the result have one row with a column for each statistic (wide = TRUE), or one row per statistic, with a column for the statistic name (name) and a column for the estimate (value) (wide = FALSE)?

...

Not currently used.
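
As a rough illustration of what beta controls, the sketch below computes an F-measure from a precision and a recall value using the standard F-beta formula. The helper f_beta() and the input values are hypothetical and are not part of yardstick; f_meas() is assumed to follow the same general formula.

```r
# Hypothetical helper (not part of yardstick): the standard F-beta formula.
# beta > 1 weights recall more heavily; beta < 1 weights precision more heavily.
f_beta <- function(precision, recall, beta = 1) {
  (1 + beta^2) * precision * recall / (beta^2 * precision + recall)
}

# Made-up values for illustration only
precision <- 0.80
recall <- 0.60

f1 <- f_beta(precision, recall)                # beta = 1: harmonic mean
f2 <- f_beta(precision, recall, beta = 2)      # leans towards recall
f_half <- f_beta(precision, recall, beta = 0.5) # leans towards precision

c(f1 = f1, f2 = f2, f_half = f_half)
```

Because recall is lower than precision in this made-up case, increasing beta pulls the score down towards the recall value, and decreasing it pulls the score up towards the precision value.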

Value

A tibble. Note that if the argument prevalence was used, the value reported in the tibble reflects the argument value and not the observed rate of events.
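
To see why a user-supplied prevalence can change the reported statistics, the sketch below applies the usual Bayes-rule expressions for positive and negative predictive value. The helper names and the sensitivity and specificity values are hypothetical, and the sketch assumes that the predictive values are among the statistics affected by the prevalence argument.

```r
# Hypothetical helpers (not part of yardstick): predictive values from
# sensitivity, specificity, and an assumed prevalence of the event.
ppv_from_prev <- function(sens, spec, prevalence) {
  (sens * prevalence) /
    (sens * prevalence + (1 - spec) * (1 - prevalence))
}

npv_from_prev <- function(sens, spec, prevalence) {
  (spec * (1 - prevalence)) /
    ((1 - sens) * prevalence + spec * (1 - prevalence))
}

# Made-up values for illustration only
sens <- 0.90
spec <- 0.80

ppv_50 <- ppv_from_prev(sens, spec, prevalence = 0.50)
ppv_70 <- ppv_from_prev(sens, spec, prevalence = 0.70)
npv_50 <- npv_from_prev(sens, spec, prevalence = 0.50)
npv_70 <- npv_from_prev(sens, spec, prevalence = 0.70)

c(ppv_50 = ppv_50, ppv_70 = ppv_70, npv_50 = npv_50, npv_70 = npv_70)
```

With sensitivity and specificity held fixed, a higher prevalence raises the positive predictive value and lowers the negative predictive value.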

Details

There is no common convention on which factor level should automatically be considered the "event" or "positive" result. In yardstick, the default is to use the first level. This behavior is controlled by a global option called yardstick.event_first, which is set to TRUE when the package is loaded. Set it to FALSE if the last level of the factor should be treated as the level of interest.
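
A minimal sketch of changing the option for a block of work and then restoring it is shown below. It uses only base R's options() and getOption(); the variable names are illustrative.

```r
# Switch to treating the last factor level as the event, keeping the
# previous setting so that it can be restored afterwards.
old <- options(yardstick.event_first = FALSE)

# The option is now FALSE for any metrics computed from here on
during <- getOption("yardstick.event_first")

# ... compute confusion matrix summaries here ...

# Restore whatever value was in place before
options(old)
```

Saving and restoring the previous value keeps the change from leaking into later code that relies on the default.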

Examples

# NOT RUN {
library(yardstick)
data("two_class_example")

cmat <- conf_mat(two_class_example, truth = "truth", estimate = "predicted")
summary(cmat, wide = TRUE)
summary(cmat, wide = TRUE, prevalence = 0.70)

library(dplyr)
data("hpc_cv")

# Compute statistics per resample then summarize
hpc_cv %>%
  group_by(Resample) %>%
  do(summary(conf_mat(., truth = "obs", estimate = "pred"))) %>%
  group_by(name) %>%
  summarize(mean = mean(value, na.rm = TRUE),
            sd = sd(value, na.rm = TRUE))
            
# }
