kmedoids: K-Medoids Clustering

Description

Compute a \(k\)-medoids partition of a dissimilarity object.

Usage

kmedoids(x, k)

Value

An object of class "kmedoids" representing the obtained partition, which is a list with the following components.

cluster: the class ids of the partition.
medoid_ids: the indices of the medoids.
criterion: the value of the criterion function of the partition.

Arguments

x: a dissimilarity object inheriting from class "dist", or a square matrix of pairwise object-to-object dissimilarity values.
k: an integer giving the number of classes to be used in the partition.

Details

Let \(d\) denote the pairwise object-to-object dissimilarity matrix corresponding to x. A \(k\)-medoids partition of x is defined as a partition of the numbers from 1 to \(n\), the number of objects in x, into \(k\) classes \(C_1, \ldots, C_k\) such that the criterion function \(L = \sum_l \min_{j \in C_l} \sum_{i \in C_l} d_{ij}\) is minimized.

This is an NP-hard optimization problem. PAM (Partitioning Around Medoids, see |Kaufman+Rousseeuw:1990|Chapter 2) is a very popular heuristic for obtaining optimal \(k\)-medoids partitions, and provided by pam in package cluster.

kmedoids is an exact algorithm based on a binary linear programming formulation of the optimization problem e.g.|Gordon+Vichi:1998|[P4'], using lp from package lpSolve as solver. Depending on available hardware resources (the number of constraints of the program is of the order \(n^2\)), it may not be possible to obtain a solution.

References

Kaufman+Rousseeuw:1990, Gordon+Vichi:1998