Compute a \(k\)-medoids partition of a dissimilarity object.
kmedoids(x, k)
a dissimilarity object inheriting from class
"dist"
, or a square matrix of pairwise
object-to-object dissimilarity values.
an integer giving the number of classes to be used in the partition.
An object of class "kmedoids"
representing the obtained
partition, which is a list with the following components.
the class ids of the partition.
the indices of the medoids.
the value of the criterion function of the partition.
Let \(d\) denote the pairwise object-to-object dissimilarity matrix
corresponding to x
. A \(k\)-medoids partition of x
is
defined as a partition of the numbers from 1 to \(n\), the number of
objects in x
, into \(k\) classes \(C_1, \ldots, C_k\) such
that the criterion function
\(L = \sum_l \min_{j \in C_l} \sum_{i \in C_l} d_{ij}\)
is minimized.
This is an NP-hard optimization problem. PAM (Partitioning Around
Medoids, see Kaufman & Rousseeuw (1990), Chapter 2) is a very popular
heuristic for obtaining optimal \(k\)-medoids partitions, and
provided by pam
in package cluster.
kmedoids
is an exact algorithm based on a binary linear
programming formulation of the optimization problem (e.g., Gordon &
Vichi (1998), [P4']), using lp
from package
lpSolve as solver. Depending on available hardware resources
(the number of constraints of the program is of the order \(n^2\)),
it may not be possible to obtain a solution.
L. Kaufman and P. J. Rousseeuw (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.
A. D. Gordon and M. Vichi (1998). Partitions of partitions. Journal of Classification, 15, 265--285. 10.1007/s003579900034.