cnfm: Confusion matrix

Description

cnfm computes the confusion matrix of the clustering with respect to an expert/reference labeling of the data. It can also be used to compare the labelings of two different clusterings of the same trajectory (see details).

Usage

cnfm(obj, ref, ...)
# S4 method for binClst,numeric
cnfm(obj, ref, ret = FALSE, ...)
# S4 method for binClstPath,missing
cnfm(obj, ref, ret = FALSE, ...)
# S4 method for binClstStck,missing
cnfm(obj, ref, ret = FALSE, ...)
# S4 method for binClst,binClst
cnfm(obj, ref, ret = FALSE, ...)

Value

If ret=TRUE, returns a matrix object with the confusion matrix values.

Arguments

obj

A binClst_instance or binClstStck instance.

ref

Either a numeric vector with an expert/reference labeling of the data, or a second binClst_instance whose labeling is compared with that of obj (see details).

...

Parameters ref and ret are optional.

ret

A boolean value (defaults to FALSE). If ret=TRUE the confusion matrix is returned as a matrix object.

Details

The confusion matrix yields marginal counts and Recall for each row, and marginal counts, Precision and class F-measure for each column. The 3x2 subset of cells at the bottom right shows (in this order): the overall Accuracy, the average Recall, the average Precision, NaN, NaN, and the overall Macro-F-Measure. The number of classes (expert/reference labeling) should match the number of clusters or, at least, not be greater than it. The overall value of the Macro-F-Measure is an average of the class F-measure values, hence it is underestimated if the number of classes is lower than the number of clusters.
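The quantities named above can be illustrated with a minimal base R sketch. It uses the standard textbook definitions of Recall, Precision, F-measure, Accuracy and Macro-F-Measure and is not taken from the cnfm source; the vectors ref and clst are hypothetical toy labelings, with reference classes in rows and clusters in columns as described above.

```r
# Hypothetical toy labelings (not from the package):
# 'ref' is the expert/reference labeling, 'clst' the cluster assignment
# of the same eight locations.
ref  <- c(1, 1, 1, 2, 2, 3, 3, 3)
clst <- c(1, 1, 2, 2, 2, 3, 3, 1)

# Cross-tabulate using a shared set of levels so the table is square:
# rows are reference classes, columns are clusters.
lv <- sort(union(ref, clst))
cm <- table(ref = factor(ref, levels = lv), clst = factor(clst, levels = lv))

# Row-wise and column-wise measures (standard definitions).
recall    <- diag(cm) / rowSums(cm)   # one value per reference class
precision <- diag(cm) / colSums(cm)   # one value per cluster
fmeasure  <- 2 * precision * recall / (precision + recall)

# Overall summaries.
accuracy <- sum(diag(cm)) / sum(cm)
macro_f  <- mean(fmeasure)            # plain average of class F-measures

print(cm)
print(round(c(accuracy = accuracy, avg_recall = mean(recall),
              avg_precision = mean(precision), macro_f = macro_f), 3))
```

Because the Macro-F-Measure here is a plain average over classes, any class with a low or zero F-measure pulls the overall value down, which is consistent with the remark above about mismatched numbers of classes and clusters.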

If obj is a binClstPath_instance and there is a column "lbl" in the obj@pth slot with an expert labeling, this labeling will be used by default.

If obj is a binClstStck instance and, for all paths in the stack, there is a column "lbl" in the obj@pth slot of each, this labeling will be used to compute the confusion matrix for the whole stack.

If obj and ref are both binClst instances (e.g. smoothed versus non-smoothed), the confusion matrix compares the two labelings.

Examples

# -- apply EMbC to the example path --
mybcp <- stbc(expth,info=-1)
# -- compute the confusion matrix --
cnfm(mybcp,expth$lbl)
# -- as we have expth$lbl the following also works --
cnfm(mybcp,mybcp@pth$lbl)
# -- or simply --
cnfm(mybcp)
# -- numerical differences with respect to the smoothed clustering --
cnfm(mybcp,smth(mybcp))
