Computes the completeness between two clusterings, such
as a predicted clustering and a ground-truth clustering.
Usage
completeness(true, pred)
Arguments
true
ground truth clustering represented as a membership
vector. Each entry corresponds to an element and the value identifies
the assigned cluster. The specific values of the cluster identifiers
are arbitrary.
pred
predicted clustering represented as a membership
vector.
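As a small illustration of the membership-vector representation, the sketch below shows two vectors that use different identifiers but describe the same clustering. The helper `canonical` is a hypothetical name introduced here for illustration only; it is not part of the documented interface.

```r
# Hypothetical helper: relabel clusters by order of first appearance,
# so that two membership vectors can be compared as partitions.
canonical <- function(membership) match(membership, unique(membership))

# Five elements, three clusters, written with two different label sets.
a <- c(7, 7, 3, 3, 9)
b <- c("x", "x", "y", "y", "z")

# Both vectors put elements 1-2 together, 3-4 together, and 5 alone.
same_partition <- identical(canonical(a), canonical(b))
```

Because only the grouping matters, `a` and `b` would be treated as the same clustering by any measure that depends on the partition alone.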
Details
Completeness is an entropy-based measure of how well a clustering
\(p\) agrees with a reference clustering \(t\). The completeness
is high if all members of a given cluster in \(t\) are assigned
to a single cluster in \(p\). It is not symmetric in its two
arguments. The completeness ranges from 0 to 1, where 1 means that
every cluster in \(t\) is contained entirely within one cluster
in \(p\).
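For reference, the definition given by Rosenberg and Hirschberg (2007) can be written in the notation used here as follows. The symbols \(n\), \(n_{ij}\), \(n_{i\cdot}\) and \(n_{\cdot j}\) are introduced for this sketch: \(n\) is the number of elements, \(n_{ij}\) is the number of elements in cluster \(i\) of \(t\) and cluster \(j\) of \(p\), and \(n_{i\cdot}\), \(n_{\cdot j}\) are the corresponding cluster sizes. Whether this function treats the degenerate case in exactly this way is an assumption.

\[
c(t, p) = 1 - \frac{H(p \mid t)}{H(p)},
\qquad
H(p \mid t) = -\sum_{i}\sum_{j} \frac{n_{ij}}{n} \log \frac{n_{ij}}{n_{i\cdot}},
\qquad
H(p) = -\sum_{j} \frac{n_{\cdot j}}{n} \log \frac{n_{\cdot j}}{n}
\]

Terms with \(n_{ij} = 0\) contribute zero to the sums. When \(H(p) = 0\), that is, when \(p\) consists of a single cluster, the ratio is undefined and a common convention is to set \(c(t, p) = 1\).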
References
Rosenberg, A. and Hirschberg, J. "V-measure: A conditional entropy-based external cluster evaluation measure." Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL) (2007).
See Also
homogeneity evaluates the homogeneity, which is a dual
measure to completeness. v_measure evaluates the harmonic mean of
completeness and homogeneity.
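Examples
The following is a minimal sketch of how completeness can be computed from two membership vectors using base R only. It is not the implementation behind `completeness`. The name `completeness_sketch` is hypothetical, and returning 1 when the predicted clustering has a single cluster is an assumed convention.

```r
# Hypothetical re-implementation for illustration only.
completeness_sketch <- function(true, pred) {
  stopifnot(length(true) == length(pred), length(true) > 0,
            !anyNA(true), !anyNA(pred))
  n <- length(true)

  # Contingency table: rows are true clusters, columns are predicted clusters.
  counts <- table(true, pred)
  row_tot <- rowSums(counts)  # sizes of the true clusters

  # Entropy of the predicted clustering, H(p).
  p_col <- colSums(counts) / n
  p_col <- p_col[p_col > 0]
  h_pred <- -sum(p_col * log(p_col))
  if (h_pred == 0) return(1)  # assumed convention for a single predicted cluster

  # Conditional entropy H(p | t); empty cells contribute zero and are skipped.
  cond <- counts / row_tot    # recycles down columns, i.e. n_ij / n_i.
  nz <- counts > 0
  h_pred_given_true <- -sum((counts[nz] / n) * log(cond[nz]))

  1 - h_pred_given_true / h_pred
}

true <- c(1, 1, 1, 2, 2, 2)

# Same partition under different labels: perfectly complete.
c_relabelled <- completeness_sketch(true, c("b", "b", "b", "a", "a", "a"))

# Everything merged into one cluster: still perfectly complete.
c_merged <- completeness_sketch(true, rep(1, 6))

# Every element in its own cluster: true clusters are split apart.
c_split <- completeness_sketch(true, 1:6)
```

Under these assumptions, `c_relabelled` and `c_merged` are both 1, while `c_split` equals `1 - log(3) / log(6)`, roughly 0.387. The merged case shows why completeness is usually read together with homogeneity: collapsing everything into one cluster keeps each true cluster intact but tells nothing about the clusters apart.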