The AIBcat function applies the Agglomerative Information Bottleneck algorithm to hierarchically cluster datasets containing only categorical variables, both nominal and ordinal. At each step, the algorithm merges the pair of clusters whose merger loses the least information, so that the resulting hierarchy retains as much information about the original distribution as possible.
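The merging criterion can be illustrated with a minimal Python sketch. It assumes the standard agglomerative information bottleneck formulation, in which the information lost by merging two clusters is their combined weight times the weighted Jensen-Shannon divergence between their conditional distributions. The names `kl_divergence` and `merge_cost` and the toy clusters are hypothetical, and this is not confirmed to be the exact computation AIBcat performs internally.

```python
import math
from itertools import combinations


def kl_divergence(p, q):
    """Kullback-Leibler divergence between two discrete distributions (natural log)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def merge_cost(w_i, w_j, p_i, p_j):
    """Information lost by merging two clusters (assumed standard AIB criterion).

    w_i, w_j: positive cluster weights (probability mass of each cluster).
    p_i, p_j: conditional distributions attached to each cluster.
    """
    w = w_i + w_j
    a, b = w_i / w, w_j / w
    # Distribution of the merged cluster: weighted mixture of the two parts.
    mix = [a * u + b * v for u, v in zip(p_i, p_j)]
    # Combined weight times the weighted Jensen-Shannon divergence.
    return w * (a * kl_divergence(p_i, mix) + b * kl_divergence(p_j, mix))


# Toy example (made-up numbers): cluster name -> (weight, conditional distribution).
clusters = {
    "a": (0.4, [0.9, 0.1]),
    "b": (0.3, [0.8, 0.2]),
    "c": (0.3, [0.1, 0.9]),
}

costs = {
    (i, j): merge_cost(clusters[i][0], clusters[j][0], clusters[i][1], clusters[j][1])
    for i, j in combinations(clusters, 2)
}

# One agglomerative step merges the pair with the smallest information loss.
best_pair = min(costs, key=costs.get)
print(best_pair)
```

In this toy example the two clusters with similar distributions are the cheapest to merge; repeating the step until one cluster remains yields the hierarchy.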
To estimate the distributions of categorical features, the function uses specialised kernel functions, as follows:
$$K_u(x = x'; \lambda) = \begin{cases}
1 - \lambda, & \text{if } x = x' \\
\frac{\lambda}{\ell - 1}, & \text{otherwise}
\end{cases}, \quad 0 \leq \lambda \leq \frac{\ell - 1}{\ell},$$
where \(\ell\) is the number of categories, and \(\lambda\) controls the smoothness of the Aitchison & Aitken kernel for nominal variables (Aitchison and Aitken, 1976).
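As a rough illustration of the nominal kernel above, the following Python sketch evaluates the kernel weight between two category labels. The function name `aitchison_aitken_kernel`, its arguments, and the example levels are hypothetical and chosen only for this example; it is not the package's internal code.

```python
def aitchison_aitken_kernel(x, x_prime, lam, n_levels):
    """Kernel weight between two nominal labels, following the formula above.

    lam is the smoothing parameter and n_levels is the number of categories.
    """
    if n_levels < 2:
        raise ValueError("n_levels must be at least 2")
    upper = (n_levels - 1) / n_levels
    if not 0.0 <= lam <= upper:
        raise ValueError(f"lam must lie in [0, {upper}]")
    # Matching labels keep weight 1 - lam; the remaining lam is shared
    # equally among the other n_levels - 1 categories.
    return 1.0 - lam if x == x_prime else lam / (n_levels - 1)


# Hypothetical nominal variable with three levels.
levels = ["red", "green", "blue"]
weights = [aitchison_aitken_kernel("red", c, lam=0.3, n_levels=len(levels))
           for c in levels]
print(weights)
```

Across all levels the weights sum to one. With \(\lambda = 0\) the kernel reduces to an indicator of equality, and at the upper bound \(\lambda = (\ell - 1)/\ell\) every category receives the same weight \(1/\ell\).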
$$K_o(x = x'; \nu) = \begin{cases}
1, & \text{if } x = x' \\
\nu^{|x - x'|}, & \text{otherwise}
\end{cases}, \quad 0 \leq \nu \leq 1,$$
where \(\nu\) is the bandwidth parameter of the kernel for ordinal variables, which accounts for the ordering of the categories (Li and Racine, 2003).
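The ordinal kernel above can be sketched in the same way. This Python example assumes the ordinal levels are coded as integer ranks so that \(|x - x'|\) is well defined; the function name `ordinal_kernel` and the four-level example are hypothetical and serve only as illustration.

```python
def ordinal_kernel(x, x_prime, nu):
    """Kernel weight between two ordinal levels coded as integer ranks."""
    if not 0.0 <= nu <= 1.0:
        raise ValueError("nu must lie in [0, 1]")
    if x == x_prime:
        return 1.0
    # The weight decays geometrically with the distance between ranks.
    return nu ** abs(x - x_prime)


# Hypothetical ordinal variable with four levels coded 1..4.
ranks = [1, 2, 3, 4]
weights = [ordinal_kernel(2, r, nu=0.5) for r in ranks]
print(weights)  # [0.5, 1.0, 0.5, 0.25]
```

Levels closer in rank to the reference level receive larger weights. With \(\nu = 0\) only identical levels contribute, and with \(\nu = 1\) all levels are weighted equally.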
Here, \(\lambda\) and \(\nu\) are bandwidth (smoothing) parameters, and \(\ell\) is the number of levels of the categorical variable. If the user does not supply the lambda argument, the algorithm determines it automatically. For ordinal variables, the function's lambda argument is used to set \(\nu\).