
Calculates a cross-tabulation of observed and predicted classes.
For conf_mat() objects, the tidy method collapses the cell counts by cell into a data frame for easy manipulation.
conf_mat(data, ...)

# S3 method for data.frame
conf_mat(data, truth, estimate, dnn = c("Prediction", "Truth"), ...)

# S3 method for table
conf_mat(data, ...)

# S3 method for conf_mat
tidy(x, ...)
data: A data frame or a base::table().
...: Options to pass to base::table() (not including dnn). This argument is not currently used for the tidy method.
truth: The column identifier for the true class results (that is a factor). This should be an unquoted column name, although this argument is passed by expression and supports quasiquotation (you can unquote column names or column positions).
estimate: The column identifier for the predicted class results (that is also a factor). As with truth, this can be specified in different ways, but the primary method is to use an unquoted variable name.
dnn: A character vector of dimnames for the table.
x: An object of class conf_mat.
conf_mat produces an object with class conf_mat. This contains the table and other objects. tidy.conf_mat generates a tibble with columns name (the cell identifier) and value (the cell count).
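As a rough illustration of the tidy output described above, the sketch below builds a confusion matrix from the hpc_cv data that ships with yardstick and inspects the resulting tibble. It assumes yardstick is installed and that tidy() is available once yardstick is attached; the object names cm and cells are only illustrative.

```r
library(yardstick)

data("hpc_cv")

# Confusion matrix over all rows (the Resample column is ignored here).
cm <- conf_mat(hpc_cv, obs, pred)

# One row per cell: `name` identifies the cell, `value` holds its count.
cells <- tidy(cm)
cells

# The cell counts should add up to the number of rows in the data.
sum(cells$value) == nrow(hpc_cv)
```

Because the result is an ordinary tibble, it can be joined, grouped, and summarized like any other data frame.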
The function requires that the factors have exactly the same levels.
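One way to satisfy this requirement is to re-level both columns against a shared set of levels before calling conf_mat(). The sketch below uses hypothetical toy data (truth_raw, pred_raw, and toy are made-up names) in which one class is never predicted, so the two factors start out with different levels. It is a minimal example under those assumptions, not the only way to align levels.

```r
library(yardstick)

# Hypothetical toy data: class "c" is never predicted,
# so the levels of `pred_raw` differ from those of `truth_raw`.
truth_raw <- factor(c("a", "b", "c", "a", "b"))
pred_raw  <- factor(c("a", "b", "a", "a", "b"))

# Re-level both columns against the union of the observed levels.
all_levels <- union(levels(truth_raw), levels(pred_raw))
toy <- data.frame(
  obs  = factor(truth_raw, levels = all_levels),
  pred = factor(pred_raw,  levels = all_levels)
)

# Both factors now share identical levels, so the call is valid.
cm_toy <- conf_mat(toy, obs, pred)
cm_toy$table
```

Using the union keeps every class in the table, including classes that were never predicted, which show up as rows of zeros.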
# NOT RUN {
library(yardstick)  # provides conf_mat(), tidy(), and the hpc_cv data
library(dplyr)

data("hpc_cv")

# The confusion matrix from a single assessment set (i.e. fold)
hpc_cv %>%
  filter(Resample == "Fold01") %>%
  conf_mat(obs, pred)

# Now compute the average confusion matrix across all folds in
# terms of the proportion of the data contained in each cell.
# First get the raw cell counts per fold using the `tidy` method
cells_per_resample <- hpc_cv %>%
  group_by(Resample) %>%
  do(tidy(conf_mat(., obs, pred)))

# Get the totals per resample
counts_per_resample <- hpc_cv %>%
  group_by(Resample) %>%
  summarize(total = n()) %>%
  left_join(cells_per_resample, by = "Resample") %>%
  # Compute the proportions
  mutate(prop = value / total) %>%
  group_by(name) %>%
  # Average
  summarize(prop = mean(prop))

counts_per_resample

# Now reshape these into a matrix
mean_cmat <- matrix(counts_per_resample$prop, byrow = TRUE, ncol = 4)
rownames(mean_cmat) <- levels(hpc_cv$obs)
colnames(mean_cmat) <- levels(hpc_cv$obs)

round(mean_cmat, 3)
# }