Learn R Programming

T4cluster (version 0.1.4)

quality.CH: (+) CH index

Description

Calinski and Harabasz index. Higher score means a good clustering.

Usage

quality.CH(data, cluster)

Value

an index value.

Arguments

data

an \((n\times p)\) matrix of row-stacked observations.

cluster

a length-\(n\) vector of class labels (from \(1:k\)).

Examples

Run this code
# \donttest{
# -------------------------------------------------------------
#        clustering validity check with 3 Gaussians
# -------------------------------------------------------------
## PREPARE
X1 = matrix(rnorm(30*3, mean=-5), ncol=3)
X2 = matrix(rnorm(30*3), ncol=3)
X3 = matrix(rnorm(30*3, mean=5), ncol=3)
XX = rbind(X1, X2, X3)

## CLUSTERING WITH DIFFERENT K VALUES & COMPUTE QUALITY INDICES
vec_k  = 2:10
vec_cl = rep(0, 9)
for (i in 1:length(vec_k)){
  cl_now    = T4cluster::kmeans(XX, k=vec_k[i])$cluster
  vec_cl[i] = quality.CH(XX, cl_now)
}


## VISUALIZE
opar <- par(no.readonly=TRUE)
plot(vec_k, vec_cl, type="b", lty=2, xlab="number of clusteres",
     ylab="score", main="CH index")
abline(v=3, lwd=2, col="red")
par(opar)
# }

Run the code above in your browser using DataLab