LCAvarsel (version 1.1)

compareCluster: Clustering comparison criteria

Description

Computes several criteria for comparing two classifications of the same data points.

Usage

compareCluster(class1, class2)

Arguments

class1

A numeric or character vector of class labels.

class2

A numeric or character vector of class labels. Must be the same length as class1.

Value

A list containing:

tab

The confusion matrix between the two clusterings.

jaccard

Jaccard index.

RI

Rand index.

ARI

Adjusted Rand index.

varInfo

Variation of information between the two clusterings.

Details

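The Jaccard, Rand and adjusted Rand indices measure the agreement between two partitions of the units. The Jaccard and Rand indices take values in the interval \([0,1]\), and a value of 1 corresponds to perfect agreement. The adjusted Rand index is corrected for chance: it also equals 1 for perfect agreement and is close to 0 for partitions that agree only by chance, but it can take negative values when the agreement is lower than expected by chance (see Hubert and Arabie, 1985). The variation of information (Meila, 2007) measures the distance between the two clusterings: it is 0 when the clusterings are identical, and small values indicate closeness.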
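
As a rough illustration of how these quantities relate to the confusion matrix, the base R sketch below computes them from pair counts. The function name pair_indices is hypothetical and is not part of LCAvarsel. The formulas are the standard textbook definitions, with natural logarithms assumed for the variation of information; compareCluster itself may differ in details such as the logarithm base or the handling of degenerate cases.

```r
# Hypothetical helper (not part of LCAvarsel): agreement indices computed
# from the cross-tabulation of two label vectors, using base R only.
pair_indices <- function(class1, class2) {
  stopifnot(length(class1) == length(class2))
  tab <- table(class1, class2)
  n <- sum(tab)

  # Number of unordered pairs of units within each count
  pairs <- function(x) sum(choose(as.numeric(x), 2))
  total <- choose(n, 2)            # all pairs of units
  both  <- pairs(tab)              # pairs grouped together in both partitions
  rows  <- pairs(rowSums(tab))     # pairs grouped together in class1
  cols  <- pairs(colSums(tab))     # pairs grouped together in class2

  # Rand index: pairs treated the same way by both partitions, over all pairs
  RI <- (total + 2 * both - rows - cols) / total
  # Jaccard index: ignores pairs separated in both partitions
  # (NaN when no pair is grouped together in either partition)
  jaccard <- both / (rows + cols - both)
  # Adjusted Rand index: corrected for the agreement expected by chance
  # (NaN in degenerate cases, e.g. both partitions have a single class)
  expected <- rows * cols / total
  ARI <- (both - expected) / ((rows + cols) / 2 - expected)

  # Variation of information: 2 * H(joint) - H(class1) - H(class2)
  entropy <- function(q) { q <- q[q > 0]; -sum(q * log(q)) }
  p <- tab / n
  varInfo <- 2 * entropy(p) - entropy(rowSums(p)) - entropy(colSums(p))

  list(tab = tab, jaccard = jaccard, RI = RI, ARI = ARI, varInfo = varInfo)
}

# Two partitions that never group the same pair together:
# the adjusted Rand index is negative here.
res <- pair_indices(c(1, 1, 2, 2), c("a", "b", "a", "b"))
res$ARI
```

In this small example every pair grouped together by one labelling is split by the other, so the agreement falls below the chance level and the adjusted Rand index drops below zero, while the Rand and Jaccard indices stay within \([0,1]\).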

References

Hubert, L. and Arabie, P. (1985). Comparing partitions. Journal of Classification, 2, 193-218.

Meila, M. (2007). Comparing clusterings - an information based distance. Journal of Multivariate Analysis, 98, 873-895.

Examples

library(LCAvarsel)

# Two random, unrelated labelings of 100 units
cl1 <- sample(1:3, 100, replace = TRUE)
cl2 <- sample(letters[1:4], 100, replace = TRUE)
compareCluster(cl1, cl2)
compareCluster(cl1, cl1)   # perfect matching