cl_validity(x, ...)
## Default S3 method:
cl_validity(x, d, ...)
d: the dissimilarity from which x was obtained.
Value: an object of class "cl_validity" with the computed validity measures.
cl_validity is a generic function.
For partitions, its default method gives the "dissimilarity accounted for", defined as $1 - a_w / a_t$, where $a_t$ is the average total dissimilarity and the average within dissimilarity $a_w$ is given by
$$a_w = \frac{\sum_{i,j} \sum_k m_{ik}m_{jk} d_{ij}}{
\sum_{i,j} \sum_k m_{ik}m_{jk}}$$
where $d$ and $m$ are the dissimilarities and memberships, respectively, and the sums are over all pairs of objects and all classes.
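As a rough illustration of the definition above, the following base R sketch computes the dissimilarity accounted for on a made-up example. The matrices m and d are hypothetical toy data. Taking $a_t$ as the mean of all entries of d (diagonal included, to match the range of the sums in $a_w$) is an assumption about what "average total dissimilarity" means, not a statement about the package internals.

```r
# Hypothetical hard partition of 4 objects into 2 classes:
# m[i, k] is the membership of object i in class k.
m <- rbind(c(1, 0),
           c(1, 0),
           c(0, 1),
           c(0, 1))

# Hypothetical symmetric dissimilarity matrix with zero diagonal.
d <- matrix(c(0, 1, 4, 5,
              1, 0, 3, 4,
              4, 3, 0, 2,
              5, 4, 2, 0), nrow = 4, byrow = TRUE)

# w[i, j] = sum_k m[i, k] * m[j, k], the weight of pair (i, j).
w <- tcrossprod(m)

# Average within dissimilarity: weighted mean of d over all pairs.
a_w <- sum(w * d) / sum(w)

# ASSUMPTION: average total dissimilarity taken as the plain mean of d.
a_t <- mean(d)

# Dissimilarity accounted for.
daf <- 1 - a_w / a_t
print(c(a_w = a_w, a_t = a_t, daf = daf))
```

With soft (fuzzy) memberships the same code applies unchanged, since w is then a matrix of fractional pair weights rather than 0/1 indicators.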
For hierarchies, the validity measures computed by default are
variance accounted for (VAF, e.g., Hubert, Arabie & Meulman,
2006) and deviance accounted for (DEV, e.g., Smith, 2001).
If u is the ultrametric corresponding to the hierarchy x
and d the dissimilarity x was obtained from, these
validity measures are given by
$$\mathrm{VAF} =
\max\left(0, 1 - \frac{\sum_{i,j} (d_{ij} - u_{ij})^2}{
\sum_{i,j} (d_{ij} - \mathrm{mean}(d)) ^ 2}\right)$$
and
$$\mathrm{DEV} =
\max\left(0, 1 - \frac{\sum_{i,j} |d_{ij} - u_{ij}|}{
\sum_{i,j} |d_{ij} - \mathrm{median}(d)|}\right)$$
respectively. Note that VAF and DEV are not invariant under rescaling
u, and may be arbitrarily small (i.e., 0 using the
above definitions) even though u and d are
structurally close in some sense.
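The two formulas can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper names vaf and dev are hypothetical, the ultrametric is taken from cophenetic() applied to an hclust result, and the sums run over the distinct pairs i < j stored in a "dist" object. It also shows the lack of invariance under rescaling u.

```r
# Dissimilarities and a hierarchy fitted to them (USArrests ships with R).
d <- dist(USArrests)
hc <- hclust(d, method = "average")

# cophenetic() returns the ultrametric induced by the dendrogram,
# stored in the same pair order as d.
u <- cophenetic(hc)

# Plain numeric vectors with one entry per pair i < j.
dv <- as.vector(d)
uv <- as.vector(u)

# Hypothetical helpers implementing the two definitions directly.
vaf <- function(d, u) max(0, 1 - sum((d - u)^2) / sum((d - mean(d))^2))
dev <- function(d, u) max(0, 1 - sum(abs(d - u)) / sum(abs(d - median(d))))

print(c(VAF = vaf(dv, uv), DEV = dev(dv, uv)))

# Rescaling u leaves the tree structure unchanged but not the measures:
# with a large factor both are truncated to 0.
print(c(VAF = vaf(dv, 100 * uv), DEV = dev(dv, 100 * uv)))
```

The rescaled ultrametric describes exactly the same dendrogram, which is why values of 0 should not by themselves be read as evidence that u and d are structurally unrelated.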
For the results of using agnes and
diana, the agglomerative and divisive
coefficients are provided in addition to the default ones.
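A minimal usage sketch, assuming the packages clue and cluster are installed (the code does nothing otherwise). The data set and the choice of average linkage are arbitrary. The exact names of the components in the returned objects are not assumed here, so the results are simply printed.

```r
# Runs only if both packages are available:
# clue provides cl_validity(), cluster provides agnes().
if (requireNamespace("clue", quietly = TRUE) &&
    requireNamespace("cluster", quietly = TRUE)) {
  d <- dist(USArrests)

  # A hierarchy from hclust(): default measures only.
  hc <- hclust(d, method = "average")
  v_hc <- clue::cl_validity(hc, d)
  print(v_hc)

  # A hierarchy from agnes(): per the text above, the agglomerative
  # coefficient is reported in addition to the default measures.
  ag <- cluster::agnes(d, method = "average")
  v_ag <- clue::cl_validity(ag, d)
  print(v_ag)

  # The agglomerative coefficient as stored in the agnes object itself.
  print(ag$ac)
}
```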
T. J. Smith (2001). Constructing ultrametric and additive trees based on the $L_1$ norm. Journal of Classification, 18/2, 185--207.
cluster.stats in package fpc for a variety of
cluster validation statistics;
fclustIndex in package e1071 for several
fuzzy cluster indexes;
clustIndex in package cclust;
silhouette in package cluster.