classAgreement()

Description

Computes several coefficients of agreement between the columns and rows of a 2-way contingency table.
Usage

classAgreement(tab, match.names = FALSE)
Arguments

tab: A 2-dimensional contingency table.

match.names: Flag whether rows and columns should be matched by name.
Value

A list with components:

diag: Percentage of data points in the main diagonal of tab.

kappa: diag corrected for agreement by chance (Cohen's kappa).

rand: Rand index.

crand: Rand index corrected for agreement by chance.
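For instance, two identical partitions yield 1 for all four components; a minimal sketch, assuming classAgreement() is loaded from the e1071 package:

library(e1071)

## Two identical partitions of 8 points: perfect agreement,
## so diag, kappa, rand, and crand are all 1.
g <- rep(1:2, each = 4)
ca <- classAgreement(table(g, g))
str(ca)        # list with components diag, kappa, rand, crand
ca$diag        # 1: all points lie on the main diagonal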
Details

Suppose we want to compare two classifications summarized by the contingency table T = [t_ij], where i, j = 1, ..., K and t_ij denotes the number of data points which are in class i in the first partition and in class j in the second partition. If both classifications use the same labels, then the two classifications agree exactly when only the elements in the main diagonal of the table are non-zero; large off-diagonal elements correspond to less agreement, so the percentage of data points in the main diagonal is a natural measure of agreement. If match.names is TRUE, the class labels as given by the row and column names are matched, i.e. only columns and rows with the same dimnames are used for the computation.
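As a sketch of how the two diagonal-based measures arise from the table (hand-rolled here for illustration; classAgreement() computes them internally, and p_obs and p_chance are names invented for this example):

## Diagonal agreement and its chance-corrected version (Cohen's kappa)
## computed by hand for a square table with matching labels.
tab <- table(c(1, 1, 2, 2, 3), c(1, 2, 2, 2, 3))
n <- sum(tab)
p_obs    <- sum(diag(tab)) / n                       # "diag": share on the main diagonal
p_chance <- sum(rowSums(tab) * colSums(tab)) / n^2   # agreement expected by chance
kappa    <- (p_obs - p_chance) / (1 - p_chance)      # "kappa"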
If the two classifications do not use the same set of labels, or if identical labels can have different meanings (e.g., two outcomes of cluster analysis on the same data set), then the situation is a little more complicated. Let A denote the number of all pairs of data points which are either put into the same cluster by both partitions or put into different clusters by both partitions. Conversely, let D denote the number of all pairs of data points that are put into the same cluster by one partition but into different clusters by the other. The partitions agree for all pairs in A and disagree for all pairs in D, and the Rand index A / (A + D) measures their agreement; it is invariant under permutations of the rows and columns of the table, so it does not depend on matching labels.

Both indices have to be corrected for agreement by chance if the sizes of the classes are not uniform.
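The pair-counting quantities can likewise be sketched directly from the table (using tab from the sketch above); the chance correction below follows Hubert and Arabie (1985), and all variable names are invented for this illustration:

## Rand index and corrected Rand index by hand.
## A = pairs on which the partitions agree (together in both,
## or separated in both); rand = A / total number of pairs.
n     <- sum(tab)
pairs <- choose(n, 2)
A     <- pairs + sum(tab^2) - (sum(rowSums(tab)^2) + sum(colSums(tab)^2)) / 2
rand  <- A / pairs

## Corrected for chance: (index - expected) / (maximum - expected).
same_both <- sum(choose(tab, 2))
same_rows <- sum(choose(rowSums(tab), 2))
same_cols <- sum(choose(colSums(tab), 2))
expected  <- same_rows * same_cols / pairs
crand     <- (same_both - expected) / ((same_rows + same_cols) / 2 - expected)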
References

J. Cohen. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46, 1960.

L. Hubert and P. Arabie. Comparing partitions. Journal of Classification, 2, 193-218, 1985.
Examples
## no class correlations: both kappa and crand almost zero
g1 <- sample(1:5, size=1000, replace=TRUE)
g2 <- sample(1:5, size=1000, replace=TRUE)
tab <- table(g1, g2)
classAgreement(tab)
## let pairs (g1=1,g2=1) and (g1=3,g2=3) agree better
k <- sample(1:1000, size=200)
g1[k] <- 1
g2[k] <- 1
k <- sample(1:1000, size=200)
g1[k] <- 3
g2[k] <- 3
tab <- table(g1, g2)
## both kappa and crand should be significantly larger than before
classAgreement(tab)
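
## Illustrative sketch: rand and crand are invariant under relabelling
## one partition, while diag and kappa are not; `perm` and `g2p` are
## ad-hoc names for this illustration, not part of the package.
perm <- c(3, 1, 2, 4, 5)
g2p  <- perm[g2]                # permute the labels of g2
classAgreement(table(g1, g2p))  # same rand/crand, different diag/kappa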