clevr (version 0.1.2)

v_measure: V-measure Between Clusterings

Description

Computes the V-measure between two clusterings, such as a predicted clustering and a ground truth clustering.

Usage

v_measure(true, pred, beta = 1)

Arguments

true

ground truth clustering represented as a membership vector. Each entry corresponds to an element and the value identifies the assigned cluster. The specific values of the cluster identifiers are arbitrary.

pred

predicted clustering represented as a membership vector.

beta

non-negative weight. A value of 0 assigns no weight to completeness (i.e. the measure reduces to homogeneity), while larger values assign increasing weight to completeness. A value of 1 weights completeness and homogeneity equally.

Details

V-measure is defined as the \(\beta\)-weighted harmonic mean of homogeneity \(h\) and completeness \(c\): $$(1 + \beta)\frac{h \cdot c}{\beta \cdot h + c}.$$ V-measure ranges from 0 to 1, where 1 corresponds to a perfect match between the clusterings. When \(\beta = 1\), it is equivalent to the normalised mutual information with the arithmetic mean as the aggregation function.
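
To make the role of \(\beta\) concrete, the special cases below follow directly from the definition. The notation \(V_\beta\) is introduced here only for illustration, and the cases assume \(h > 0\) and \(c > 0\) so that no denominator vanishes: $$V_0 = \frac{h \cdot c}{c} = h, \qquad V_1 = \frac{2 \cdot h \cdot c}{h + c}, \qquad \lim_{\beta \to \infty} V_\beta = \lim_{\beta \to \infty} \frac{(1 + \beta) \cdot h \cdot c}{\beta \cdot h + c} = c.$$ In words, \(\beta = 0\) recovers homogeneity, \(\beta = 1\) gives the ordinary harmonic mean, and very large \(\beta\) approaches completeness.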

References

Rosenberg, A. and Hirschberg, J. "V-measure: A conditional entropy-based external cluster evaluation measure." Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), (2007).

Becker, H. "Identification and characterization of events in social media." PhD dissertation, Columbia University, (2011).

See Also

homogeneity and completeness evaluate the component measures upon which this measure is based.

Examples

library(clevr)

true <- c(1, 1, 1, 2, 2)  # ground truth clustering
pred <- c(1, 1, 2, 2, 2)  # predicted clustering
v_measure(true, pred)     # beta = 1: homogeneity and completeness weighted equally
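
As a cross-check on the definition given in Details, the following base-R sketch computes homogeneity, completeness and the V-measure from the contingency table of the two membership vectors. It is not the package's implementation: the names `entropy` and `v_measure_sketch` are hypothetical, and the conventions for degenerate inputs (a component is set to 1 when its normalising entropy is zero, and the result is 0 when the denominator vanishes) are assumptions made for this illustration.

```r
# Shannon entropy (natural log) of a probability vector; zero entries are skipped
entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# Hypothetical helper, not part of clevr: V-measure from first principles
v_measure_sketch <- function(true, pred, beta = 1) {
  # joint distribution over (true cluster, predicted cluster) pairs
  joint <- table(true, pred) / length(true)

  h_true  <- entropy(rowSums(joint))    # H(true)
  h_pred  <- entropy(colSums(joint))    # H(pred)
  h_joint <- entropy(as.vector(joint))  # H(true, pred)

  # homogeneity = 1 - H(true | pred) / H(true), using H(true | pred) = H(true, pred) - H(pred)
  homog <- if (h_true == 0) 1 else 1 - (h_joint - h_pred) / h_true
  # completeness = 1 - H(pred | true) / H(pred), using H(pred | true) = H(true, pred) - H(true)
  compl <- if (h_pred == 0) 1 else 1 - (h_joint - h_true) / h_pred

  denom <- beta * homog + compl
  if (denom == 0) return(0)  # assumed convention when both components vanish
  (1 + beta) * homog * compl / denom
}

true <- c(1, 1, 1, 2, 2)
pred <- c(1, 1, 2, 2, 2)
v_measure_sketch(true, pred)            # roughly 0.43 for this example
v_measure_sketch(true, pred, beta = 0)  # reduces to homogeneity
```

Because cluster identifiers are arbitrary, relabelling either vector (for example, swapping 1 and 2 in `pred`) leaves the result unchanged.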