BCA (version 0.9-2)

bootCVD: Cluster Solution Diagnostics Using Bootstrap Replicates

Description

Plots both the Rand index and the Calinski-Harabasz index for different numbers of clusters on a common underlying dataset, using the K-Means, K-Medians, or Neural Gas clustering algorithm applied to a set of bootstrap replicates of the data.

Usage

bootCVD(x, k, nboot=100, nrep=1, method = c("kmn", "kmd", "neuralgas"),
   col1, col2, dsname)
bootCH(xdat, k_vals, clstr1, clstr2, cntrs1, cntrs2,
   method = c("kmn", "kmd", "neuralgas"))
bootPlot(fc, ch, col1="blue", col2="green")

Arguments

Value

The functions bootCVD and bootPlot return invisibly. Their benefit is the side-effect plot they produce and the printed summary of the index values. The function bootCH returns a matrix of Calinski-Harabasz index values: the rows are the replicates, and each column corresponds to a particular number of clusters for a solution.
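As a rough illustration of how a replicates-by-cluster-count matrix of this shape might be summarised, the sketch below uses base R only. The matrix ch_mat and the vector k_vals are hypothetical stand-ins filled with random placeholder values. They are not output from bootCH, and the summary shown is only one reasonable choice.

```r
# Hypothetical stand-in for a bootCH-style result: 50 replicates (rows)
# by 4 candidate numbers of clusters (columns). The values are random
# placeholders, NOT output from BCA.
set.seed(1)
k_vals <- 2:5
ch_mat <- matrix(rexp(50 * length(k_vals), rate = 1 / 100),
                 nrow = 50,
                 dimnames = list(NULL, paste0("k=", k_vals)))

# Quartiles of the index across replicates, one row per number of clusters.
ch_summary <- t(apply(ch_mat, 2, quantile, probs = c(0.25, 0.5, 0.75)))
print(round(ch_summary, 1))

# Side-by-side boxplots, one box per number of clusters.
boxplot(ch_mat,
        xlab = "Number of clusters",
        ylab = "Calinski-Harabasz index")
```

With real index values in place of the placeholders, a column whose box sits clearly above the others would point to the number of clusters favoured by this criterion.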

Details

The Rand index provides a measure of cluster stability, with relatively higher values indicating relatively more stable clusters, and the Calinski-Harabasz index gives a ratio of cluster separation to cluster homogeneity, with higher values being preferred. The use of bootstrap replicates addresses potential randomness in both the sample data and the clustering algorithms.
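The two indices can be sketched in base R to make the idea concrete. The code below is a minimal illustration and not the BCA implementation: it uses stats::kmeans only, and the helper names rand_index, ch_index, nearest_centre, and one_replicate are hypothetical. It assumes the paired-bootstrap scheme in which two bootstrap samples are clustered separately and the original observations are then assigned to the nearest centre of each solution.

```r
# Illustrative only: base R, not the BCA implementation.
set.seed(42)

# Toy data: three well-separated groups in two dimensions.
x <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 6, sd = 0.3), ncol = 2)
)

# Rand index: share of point pairs on which two partitions agree
# (both put the pair together, or both put it apart).
rand_index <- function(a, b) {
  same_a <- outer(a, a, "==")
  same_b <- outer(b, b, "==")
  pairs <- upper.tri(same_a)
  mean(same_a[pairs] == same_b[pairs])
}

# Calinski-Harabasz index of a kmeans fit on n observations:
# (between-cluster SS / (k - 1)) / (within-cluster SS / (n - k)).
ch_index <- function(fit, n) {
  k <- nrow(fit$centers)
  (fit$betweenss / (k - 1)) / (fit$tot.withinss / (n - k))
}

# Assign each row of x to its nearest centre (squared Euclidean distance).
nearest_centre <- function(x, centres) {
  d2 <- sapply(seq_len(nrow(centres)),
               function(j) rowSums(sweep(x, 2, centres[j, ])^2))
  max.col(-d2, ties.method = "first")
}

# One replicate: cluster two bootstrap samples, then compare the
# partitions they induce on the original data.
one_replicate <- function(x, k) {
  n <- nrow(x)
  b1 <- x[sample(n, replace = TRUE), , drop = FALSE]
  b2 <- x[sample(n, replace = TRUE), , drop = FALSE]
  fit1 <- kmeans(b1, centers = k, nstart = 5)
  fit2 <- kmeans(b2, centers = k, nstart = 5)
  p1 <- nearest_centre(x, fit1$centers)
  p2 <- nearest_centre(x, fit2$centers)
  c(rand = rand_index(p1, p2), ch = ch_index(fit1, n))
}

# Average both indices over 20 replicates for each candidate k.
k_vals <- 2:5
res <- sapply(k_vals, function(k) rowMeans(replicate(20, one_replicate(x, k))))
colnames(res) <- paste0("k=", k_vals)
print(round(res, 3))
```

Because both the resampling and the k-means starts are random, averaging over replicates smooths out both sources of variation, which is the motivation given above for the bootstrap approach.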

References

S. Dolnicar, F. Leisch (2010), Evaluation of Structure and Reproducibility of Cluster Solution Using the Bootstrap. Marketing Letters, 21:1.

F. Leisch (2006), A Toolbox for K-Centroids Cluster Analysis. Computational Statistics and Data Analysis, 51:2.

See Also

bootFlexclust