Calinski-Harabasz index for estimating the number of clusters,
computed here from an observations/variables matrix. A distance-based
version is available through cluster.stats.
calinhara(x,clustering,cn=max(clustering))
x: data matrix or data frame.
clustering: vector of integers giving the cluster of each observation.
cn: integer. Number of clusters (default: max(clustering)).
Calinski-Harabasz statistic, which is
(n-cn)*sum(diag(B))/((cn-1)*sum(diag(W))),
where n is the number of observations, B is the between-cluster
covariance matrix (computed from the cluster means), and W is the
within-cluster covariance matrix.
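As a rough illustration of the formula, the statistic can be reproduced with base R alone. This is a sketch, not the package's implementation: ch_from_scatter is a hypothetical helper name, cluster labels are assumed to be integers in 1..cn, and B and W are read as unnormalised sums-of-squares-and-cross-products matrices, which is an assumption about their scaling.

```r
# Hypothetical helper (not a package function): Calinski-Harabasz statistic
# built from scatter matrices, assuming labels are integers in 1..cn.
ch_from_scatter <- function(x, clustering, cn = max(clustering)) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  grand <- colMeans(x)
  B <- matrix(0, p, p)
  W <- matrix(0, p, p)
  for (i in seq_len(cn)) {
    xi <- x[clustering == i, , drop = FALSE]
    ni <- nrow(xi)
    if (ni == 0) next
    ci <- colMeans(xi)
    # Between: cluster size times outer product of the centred cluster mean.
    B <- B + ni * tcrossprod(ci - grand)
    # Within: cross-product of observations centred at their cluster mean.
    W <- W + crossprod(sweep(xi, 2, ci))
  }
  (n - cn) * sum(diag(B)) / ((cn - 1) * sum(diag(W)))
}

set.seed(98765)
iriss <- iris[sample(150, 20), -5]
km <- kmeans(iriss, 3)
ch <- ch_from_scatter(iriss, km$cluster)

# The two traces are the sums of squares that kmeans() already reports.
ch_km <- (nrow(iriss) - 3) * km$betweenss / ((3 - 1) * km$tot.withinss)
all.equal(ch, ch_km)
```

The comparison at the end relies on sum(diag(B)) being the between-cluster sum of squares and sum(diag(W)) the total within-cluster sum of squares, so for a kmeans fit the statistic can also be read off km$betweenss and km$tot.withinss.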
Calinski, T., and Harabasz, J. (1974) A Dendrite Method for Cluster Analysis, Communications in Statistics, 3, 1-27.
library(fpc)  # provides calinhara
set.seed(98765)
# 20 randomly chosen iris observations, species column dropped
iriss <- iris[sample(150,20),-5]
km <- kmeans(iriss,3)
round(calinhara(iriss,km$cluster),digits=2)
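Since the index is meant for estimating the number of clusters, one possible workflow is to compute it for several candidate values and keep the one with the largest index. The sketch below uses base R only; ch_index is a hypothetical helper that takes the sums of squares from a kmeans fit, and the candidate range 2:6 and nstart = 10 are arbitrary choices made for illustration.

```r
# Hypothetical helper (not a package function): index from a kmeans fit,
# using the between and within sums of squares that kmeans() returns.
ch_index <- function(fit, n) {
  k <- length(fit$size)
  (n - k) * fit$betweenss / ((k - 1) * fit$tot.withinss)
}

set.seed(98765)
x <- iris[, -5]
ks <- 2:6  # candidate numbers of clusters (arbitrary range)

# One kmeans fit per candidate k, several random starts each.
ch <- sapply(ks, function(k) ch_index(kmeans(x, k, nstart = 10), nrow(x)))
names(ch) <- ks

# Candidate with the largest index.
best_k <- ks[which.max(ch)]
```

With the package loaded, calinhara(x, fit$cluster) could be called in place of the helper.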