Hopkins statistic: A value close to 1 (well above 0.5) indicates that the
dataset has a strong clustering tendency. The statistic is calculated using
the corrected formula from Cross and Jain (1982) with exponent d = D, where D
is the dimensionality (number of columns) of the data. Under the null
hypothesis of spatial randomness, the Hopkins statistic follows a Beta(n, n)
distribution, where n is the number of sampled points.
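To make the formula concrete, the following is a minimal base-R sketch of the statistic described above, not factoextra's internal code. The function name hopkins_sketch is hypothetical, and drawing the artificial points uniformly from the bounding box of the data is an assumption made for illustration.

```r
# Minimal sketch of the Hopkins statistic with exponent d = D.
# hopkins_sketch is a hypothetical name; this is not factoextra's code.
hopkins_sketch <- function(X, n) {
  X <- as.matrix(X)
  D <- ncol(X)                      # dimensionality, used as the exponent
  idx <- sample(nrow(X), n)         # n real points sampled from the data

  # Assumption: n artificial points drawn uniformly from the bounding box
  lo <- apply(X, 2, min)
  hi <- apply(X, 2, max)
  U <- sapply(seq_len(D), function(j) runif(n, lo[j], hi[j]))
  U <- matrix(U, nrow = n)

  # Euclidean distance from one point p to every row of M
  dist_to_rows <- function(p, M) sqrt(rowSums(sweep(M, 2, p)^2))

  # u: distance from each artificial point to its nearest real point
  u <- apply(U, 1, function(p) min(dist_to_rows(p, X)))
  # w: distance from each sampled real point to its nearest other real point
  w <- sapply(idx, function(i) min(dist_to_rows(X[i, ], X[-i, , drop = FALSE])))

  sum(u^D) / (sum(u^D) + sum(w^D))
}

set.seed(123)
n <- 20
H <- hopkins_sketch(iris[, 1:4], n = n)
# Upper-tail probability under the Beta(n, n) null distribution
p_value <- pbeta(H, n, n, lower.tail = FALSE)
```

A small p_value suggests that the observed H is larger than spatial randomness would typically produce.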
Note on interpretation: This function returns the Hopkins statistic H
where values close to 1 indicate clusterable data. Some other packages (e.g.,
performance::check_clusterstructure) return 1-H, where values close to
0 indicate clusterability. Always check the documentation of the specific
implementation you are using.
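When comparing values from different implementations, it can help to normalize them to a single convention first. The helper below is a hypothetical illustration (the name as_high_convention is not part of any package); it only applies the 1 - H relationship stated above.

```r
# Hypothetical helper: express a Hopkins value in the convention where
# values close to 1 indicate clusterable data.
as_high_convention <- function(value, convention = c("H", "1-H")) {
  convention <- match.arg(convention)
  if (convention == "H") value else 1 - value
}

as_high_convention(0.15, convention = "1-H")  # 0.85
```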
Breaking change: factoextra uses the corrected Hopkins statistic
formula (Wright 2022). Results differ from those of legacy factoextra
versions, and a one-time warning is emitted. Set
options(factoextra.warn_hopkins = FALSE) to silence the warning.
For large datasets, nearest-neighbor distances are computed with a low-memory
fallback when the full pairwise matrix would exceed
getOption("factoextra.hopkins.max_matrix_cells", 2e7) cells.
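One way such a fallback could work is to process the query points in chunks so that no intermediate distance matrix exceeds the cell limit. The sketch below is an assumption about the general technique, not factoextra's actual implementation; the function name nn_dist_chunked is hypothetical, and only the option name is taken from the text above.

```r
# Sketch of chunked nearest-neighbor distances (hypothetical helper).
# For each row of Q, returns the distance to the closest row of X while
# keeping every intermediate matrix at or below max_cells entries.
nn_dist_chunked <- function(Q, X,
    max_cells = getOption("factoextra.hopkins.max_matrix_cells", 2e7)) {
  Q <- as.matrix(Q)
  X <- as.matrix(X)
  # Number of query rows per chunk so that chunk_rows * nrow(X) <= max_cells
  chunk_rows <- max(1, floor(max_cells / nrow(X)))
  x_sq <- rowSums(X^2)
  out <- numeric(nrow(Q))
  for (start in seq(1, nrow(Q), by = chunk_rows)) {
    rows <- start:min(start + chunk_rows - 1, nrow(Q))
    Qc <- Q[rows, , drop = FALSE]
    # Squared distances via |q|^2 + |x|^2 - 2 * q.x
    d2 <- outer(rowSums(Qc^2), x_sq, "+") - 2 * tcrossprod(Qc, X)
    # Clamp tiny negative values caused by floating-point cancellation
    out[rows] <- sqrt(pmax(apply(d2, 1, min), 0))
  }
  out
}
```

Lowering the option, for example options(factoextra.hopkins.max_matrix_cells = 1e6), would make this sketch use smaller chunks at the cost of more loop iterations.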
VAT (Visual Assessment of cluster Tendency): VAT reveals the clustering
tendency visually. The number of square-shaped dark (or colored) blocks along
the diagonal of a VAT image suggests the number of clusters in the data.
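The idea can be illustrated with a base-R sketch that reorders a dissimilarity matrix so that similar observations sit next to each other and then draws it as an image. Note the assumptions: the function name vat_sketch is hypothetical, and the ordering comes from hierarchical clustering as a simple stand-in for the original VAT reordering, so this is not a faithful VAT implementation.

```r
# Sketch of a VAT-style ordered dissimilarity image (hypothetical helper).
vat_sketch <- function(X, plot = TRUE) {
  d <- dist(scale(X))                          # pairwise Euclidean dissimilarities
  # Stand-in ordering (assumption): hierarchical clustering leaf order
  ord <- hclust(d, method = "average")$order
  M <- as.matrix(d)[ord, ord]                  # reordered dissimilarity matrix
  if (plot) {
    # Low dissimilarity is drawn dark, so clusters appear as dark squares
    # along the diagonal (which image() draws from bottom-left to top-right)
    image(M, col = gray.colors(64, start = 0, end = 1), axes = FALSE,
          main = "Ordered dissimilarity image")
  }
  invisible(M)
}

M <- vat_sketch(iris[, 1:4], plot = FALSE)
```

Calling vat_sketch(iris[, 1:4]) with the default plot = TRUE draws the image on the active graphics device.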