Rand.index: Rand Index of Agreement Between Two Partitions

Description

Calculates the Rand index between two partitions of a set.

Usage

Rand.index(x, y)

Value

The Rand index (not adjusted for chance)

Arguments

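x: vector of labels defining the first partition (one label per element of the set)
y: vector of labels defining the second partition, of the same length as x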

Details

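The two vectors x and y must have equal length. Given a set \(S\) and two partitions \(X\) and \(Y\) of \(S\), the Rand index is the proportion of pairs of elements of \(S\) (out of all pairs) on which the two partitions agree. A pair counts as an agreement if it is concordant in both partitions (its two elements belong to the same member of \(X\) and to the same member of \(Y\)) or discordant in both (its two elements belong to different members of \(X\) and to different members of \(Y\)). The index lies between 0 and 1 and equals 1 when \(X\) and \(Y\) are identical up to a relabeling of their members.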
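As an illustration of the definition above, the following is a minimal sketch that counts agreeing pairs directly. The function name rand_index_sketch is hypothetical and is not part of the package; it is not necessarily how Rand.index is implemented, and it assumes x and y are atomic label vectors of equal length with at least two elements.

```r
# Hypothetical helper (not part of the package): Rand index by direct pair counting.
rand_index_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  same_x <- outer(x, x, "==")   # TRUE where elements i and j share a member of X
  same_y <- outer(y, y, "==")   # TRUE where elements i and j share a member of Y
  pairs <- upper.tri(same_x)    # select each unordered pair (i < j) exactly once
  # Proportion of pairs that are concordant in both or discordant in both
  mean(same_x[pairs] == same_y[pairs])
}

x <- c(1, 1, 2, 2)
y <- c(1, 1, 1, 2)
rand_index_sketch(x, y)  # 3 of the 6 pairs agree, so the result is 0.5
```

The sketch builds two n-by-n matrices, so it is only suitable for small inputs; a contingency-table formulation would scale better.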

References

W. M. Rand (1971). "Objective criteria for the evaluation of clustering methods". Journal of the American Statistical Association, 66(336), 846-850.
https://en.wikipedia.org/wiki/Rand_index

Examples

## Example 1
x <- sample.int(3, 20, replace = TRUE)
y <- sample.int(3, 20, replace = TRUE)
table(x, y)
Rand.index(x, y)

## Example 2
data(optdigits)
label <- optdigits$label 
m <- length(unique(label)) # 10 
n <- length(unique(optdigits$unit)) # 100
dim(label) <- c(m,n)
p <- ncol(optdigits$x) # 64
x <- array(t(optdigits$x),c(p,m,n))
## Permute data and labels to make problem harder
for (i in 1:n) {
	sigma <- sample.int(m)
	x[,,i] <- x[,sigma,i]
	label[,i] <- label[sigma,i]
}
## Compare Rand indices of matching methods
Rand.index(match.bca(x)$cluster, label)
Rand.index(match.rec(x)$cluster, label)
Rand.index(match.template(x)$cluster, label)
Rand.index(match.kmeans(x)$cluster, label)