clevr (version 0.1.2)

rand_index: Rand Index Between Clusterings

Description

Computes the Rand index (RI) between two clusterings, such as a predicted and ground truth clustering.

Usage

rand_index(true, pred)

Arguments

true

ground truth clustering represented as a membership vector. Each entry corresponds to an element and the value identifies the assigned cluster. The specific values of the cluster identifiers are arbitrary.

pred

predicted clustering represented as a membership vector.
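
As an illustration of why the specific label values are arbitrary, the following minimal sketch uses base R only. The vectors `x` and `y` are made-up examples, not package data. It compares two membership vectors through their co-membership matrices, which depend only on which elements are grouped together.

```r
# Two membership vectors with different labels but the same grouping
# (illustrative data, not taken from the package)
x <- c(1, 1, 1, 2, 2)
y <- c("b", "b", "b", "a", "a")

# Co-membership matrix: entry [i, j] is TRUE when elements i and j
# are assigned to the same cluster
co_x <- outer(x, x, "==")
co_y <- outer(y, y, "==")

# The labels differ, but the induced partitions are identical
same_partition <- identical(co_x, co_y)
```

Because the Rand index is defined in terms of pairs of elements, two membership vectors with identical co-membership matrices describe the same clustering for its purposes.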

Details

The Rand index (RI) can be expressed as: $$\mathrm{RI} = \frac{a + b}{{n \choose 2}},$$ where

  • \(n\) is the number of elements,

  • \(a\) is the number of pairs of elements that appear in the same cluster in both clusterings, and

  • \(b\) is the number of pairs of elements that appear in distinct clusters in both clusterings.

The RI takes on values between 0 and 1, where 1 denotes exact agreement between the clusterings and 0 denotes disagreement on all pairs of elements.
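
The pair-counting definition above can be sketched directly in base R. The helper `rand_index_naive` below is hypothetical and is not the package's implementation. It enumerates all \({n \choose 2}\) pairs, so it is only suitable for small inputs.

```r
# Hypothetical helper (not part of clevr): computes the RI straight
# from the pair-counting definition.
rand_index_naive <- function(true, pred) {
  stopifnot(length(true) == length(pred), length(true) >= 2)
  n <- length(true)

  # All unordered pairs of element indices, one pair per column
  pairs <- combn(n, 2)

  # For each pair, is it co-clustered in each clustering?
  same_true <- true[pairs[1, ]] == true[pairs[2, ]]
  same_pred <- pred[pairs[1, ]] == pred[pairs[2, ]]

  a <- sum(same_true & same_pred)    # together in both clusterings
  b <- sum(!same_true & !same_pred)  # apart in both clusterings

  (a + b) / choose(n, 2)
}

# For these vectors a = 2 and b = 4, so the RI is (2 + 4) / 10 = 0.6
rand_index_naive(c(1, 1, 1, 2, 2), c(1, 1, 2, 2, 2))
```

Enumerating pairs keeps the sketch close to the formula, at the cost of memory and time that grow quadratically in \(n\).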

References

Rand, W. M. "Objective Criteria for the Evaluation of Clustering Methods." Journal of the American Statistical Association 66(336), 846-850 (1971). doi:10.1080/01621459.1971.10482356

Examples

true <- c(1, 1, 1, 2, 2)  # ground truth clustering
pred <- c(1, 1, 2, 2, 2)  # predicted clustering
# 2 pairs are together in both and 4 pairs are apart in both, so RI = 6/10 = 0.6
rand_index(true, pred)