50% off | Unlimited Data & AI Learning

Last chance! 50% off unlimited learning

Sale ends in


clue (version 0.3-27)

kmedoids: K-Medoids Clustering

Description

Compute a $k$-medoids partition of a dissimilarity object.

Usage

kmedoids(x, k)

Arguments

x
a dissimilarity object inheriting from class "dist", or a square matrix of pairwise object-to-object dissimilarity values.
k
an integer giving the number of classes to be used in the partition.

Value

  • An object of class "kmedoids" representing the obtained partition, which is a list with the following components.
  • clusterthe class ids of the partition.
  • medoid_idsthe indices of the medoids.
  • criterionthe value of the criterion function of the partition.

Details

Let $d$ denote the pairwise object-to-object dissimilarity matrix corresponding to x. A $k$-medoids partition of x is defined as a partition of the numbers from 1 to $n$, the number of objects in x, into $k$ classes $C_1, \ldots, C_k$ such that the criterion function $L = \sum_l \min_{j \in C_l} \sum_{i \in C_l} d_{ij}$ is minimized.

This is an NP-hard optimization problem. PAM (Partitioning Around Medoids, see Kaufman & Rousseeuw (1990), Chapter 2) is a very popular heuristic for obtaining optimal $k$-medoids partitions, and provided by pam in package cluster.

kmedoids is an exact algorithm based on a binary linear programming formulation of the optimization problem (e.g., Gordon & Vichi (1998), [P4']), using lp from package lpSolve as solver. Depending on available hardware resources (the number of constraints of the program is of the order $n^2$), it may not be possible to obtain a solution.

References

L. Kaufman and P. J. Rousseeuw (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.

A. D. Gordon and M. Vichi (1998). Partitions of partitions. Journal of Classification, 15, 265--285.