The AIBcat function applies the Agglomerative Information Bottleneck algorithm to hierarchically cluster datasets containing only continuous variables. At each step, the algorithm merges clusters according to an information-theoretic criterion, chosen so that the merged clustering retains as much information about the original distribution as possible.
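To make the merge criterion concrete, the sketch below uses the standard agglomerative information bottleneck cost: the weighted Jensen-Shannon divergence between the conditional distributions of two clusters, which equals the information lost by merging them. It is an illustration in Python, not the package's code. The names `kl`, `merge_cost`, and `cheapest_merge` are hypothetical, and whether the function computes the cost in exactly this form is an assumption.

```python
import math
from itertools import combinations


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; zero-probability terms of p contribute 0."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)


def merge_cost(w_i, p_i, w_j, p_j):
    """Information lost by merging two clusters (hypothetical helper).

    w_i, w_j: positive cluster weights (cluster probabilities).
    p_i, p_j: conditional distributions attached to each cluster.
    Returns (w_i + w_j) times the weighted Jensen-Shannon divergence.
    """
    w = w_i + w_j
    a, b = w_i / w, w_j / w
    # Conditional distribution of the merged cluster.
    mix = [a * x + b * y for x, y in zip(p_i, p_j)]
    return w * (a * kl(p_i, mix) + b * kl(p_j, mix))


def cheapest_merge(weights, conds):
    """Return the index pair (i, j), i < j, whose merge loses the least information."""
    return min(
        combinations(range(len(weights)), 2),
        key=lambda ij: merge_cost(weights[ij[0]], conds[ij[0]],
                                  weights[ij[1]], conds[ij[1]]),
    )


# Three toy clusters: the first two have similar distributions.
weights = [1 / 3, 1 / 3, 1 / 3]
conds = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
best_pair = cheapest_merge(weights, conds)
```

Repeating this greedy step until one cluster remains yields the hierarchy; retaining the most information at each step is the same as choosing the merge with the smallest cost.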
The function uses the Gaussian kernel (Silverman, 1998) to estimate the probability densities of continuous features. The kernel is defined as:
$$K_c\left(\frac{x - x'}{s}\right) = \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{\left(x - x'\right)^2}{2s^2}\right\}, \quad s > 0.$$
The bandwidth parameter \(s\), which controls the smoothness of the density estimate, is automatically determined by the algorithm if not provided by the user.
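The following minimal Python sketch shows how the kernel above can be turned into a density estimate and how the bandwidth affects it. It is illustrative only. The helper names are hypothetical, and the fallback bandwidth (the normal-reference rule of thumb, 1.06 times the sample standard deviation times n^(-1/5)) is an assumption: the documentation does not state which rule the function applies when \(s\) is not supplied.

```python
import math
import statistics


def gaussian_kernel(x, x_prime, s):
    """Gaussian kernel K_c((x - x') / s) as defined above; requires s > 0."""
    if s <= 0:
        raise ValueError("bandwidth s must be positive")
    return math.exp(-((x - x_prime) ** 2) / (2 * s ** 2)) / math.sqrt(2 * math.pi)


def rule_of_thumb_bandwidth(sample):
    """Assumed default: normal-reference rule of thumb (needs at least 2 points)."""
    n = len(sample)
    return 1.06 * statistics.stdev(sample) * n ** (-1 / 5)


def kernel_density(x, sample, s=None):
    """Kernel density estimate at x: (1 / (n * s)) * sum of K_c((x - x_i) / s)."""
    if s is None:
        s = rule_of_thumb_bandwidth(sample)
    return sum(gaussian_kernel(x, xi, s) for xi in sample) / (len(sample) * s)


sample = [1.0, 2.0, 3.0, 4.0, 5.0]
density_auto = kernel_density(3.0, sample)           # bandwidth chosen by the assumed rule
density_smooth = kernel_density(3.0, sample, s=5.0)  # larger s gives a flatter estimate
```

A larger \(s\) spreads each observation's contribution over a wider range, which lowers and flattens the estimate near the data; a smaller \(s\) produces a spikier estimate.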