Usage:

cluster.stats(d, clustering, alt.clustering=NULL,
              silhouette=TRUE, G2=FALSE, G3=FALSE)

Arguments:

d: a distance object (as generated by dist) or a distance matrix between cases.

clustering: an integer vector with one entry per case, indicating a clustering.

alt.clustering: an integer vector such as for clustering, indicating an alternative clustering. If provided, the corrected Rand index for clustering vs. alt.clustering is computed.

silhouette: logical. If TRUE, the silhouette statistics are computed, which requires package cluster.

G2: logical. If TRUE, Goodman and Kruskal's index G2 (cf. Gordon (1999), p. 62) is computed. This executes lots of sorting algorithms and can be very slow (it has been improved by R. Francois - thanks!).

G3: logical. If TRUE, the index G3 (cf. Gordon (1999), p. 62) is computed. This executes sort on all distances and can be extremely slow.

Value:

cluster.stats returns a list containing the components
n, cluster.number, cluster.size, diameter,
average.distance, median.distance, separation, average.toother,
separation.matrix, average.between, average.within,
n.between, n.within, clus.avg.silwidths, avg.silwidth,
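g2, g3, hubertgamma, dunn, wb.ratio, corrected.rand.

Among these:

clus.avg.silwidths: vector of cluster average silhouette widths. See silhouette.

avg.silwidth: average silhouette width. See silhouette.

wb.ratio: average.within/average.between.

corrected.rand: corrected Rand index (if alt.clustering has been specified), see Gordon (1999, p. 198).

References:

Gordon, A. D. (1999) Classification, 2nd ed. Chapman and Hall.

Halkidi, M., Batistakis, Y., Vazirgiannis, M. (2002) Cluster validity methods, SIGMOD Record, 31, 40-45.

Milligan, G. W. and Cooper, M. C. (1985) An examination of procedures for determining the number of clusters. Psychometrika, 50, 159-179.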

See also:

silhouette, dist

clusterboot computes clusterwise stability statistics by resampling.

Examples:

library(fpc)  # provides rFace and cluster.stats
set.seed(20000)
face <- rFace(200,dMoNo=2,dNoEy=0,p=2)
dface <- dist(face)
complete3 <- cutree(hclust(dface),3)
cluster.stats(dface,complete3,
alt.clustering=as.integer(attr(face,"grouping")))
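
To make the wb.ratio component (average.within/average.between) more concrete, the following base-R sketch computes a within/between distance ratio by hand on made-up toy data. It does not call fpc. It assumes that "within" and "between" mean plain averages over all pairs of cases in the same cluster and in different clusters, respectively; cluster.stats may weight these averages differently, so the numbers need not match its output. The names x, dm, same, upper, avg_within, avg_between and ratio are illustrative only.

```r
# Base-R sketch of a within/between distance ratio (assumption: simple
# averages over pairs; not necessarily the weighting used by cluster.stats).
set.seed(1)

# Toy data: two well-separated groups of 20 points in 2 dimensions.
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 5), ncol = 2))

d <- dist(x)                             # pairwise Euclidean distances
clustering <- cutree(hclust(d), k = 2)   # integer cluster labels, one per case

dm <- as.matrix(d)                       # full symmetric distance matrix
same <- outer(clustering, clustering, "==")  # TRUE where labels agree
upper <- upper.tri(dm)                   # count each unordered pair once

avg_within  <- mean(dm[upper & same])    # mean distance within clusters
avg_between <- mean(dm[upper & !same])   # mean distance between clusters
ratio <- avg_within / avg_between

print(c(within = avg_within, between = avg_between, ratio = ratio))
```

A small ratio suggests compact clusters relative to the distances between them.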