# pam

##### Partitioning Around Medoids (PAM)

Returns a partitioning (clustering) of the data into `k` clusters.

- Keywords
- cluster

##### Usage

`pam(x, k, diss = inherits(x, "dist"), metric = "euclidean", stand = FALSE)`

##### Arguments

- x
- data matrix or data frame, or dissimilarity matrix or object, depending on the value of the `diss` argument. In case of a matrix or data frame, each row corresponds to an observation, and each column corresponds to a variable. All

- k
- positive integer specifying the number of clusters, less than the number of observations.
- diss
- logical flag: if TRUE (default for `dist` or `dissimilarity` objects), then `x` will be considered as a dissimilarity matrix. If FALSE, then `x` will be considered as a matrix of observations by variables.
- metric
- character string specifying the metric to be used for calculating dissimilarities between observations. The currently available options are "euclidean" and "manhattan". Euclidean distances are root sum-of-squares of differences, and manhattan
- stand
- logical; if true, the measurements in `x` are standardized before calculating the dissimilarities. Measurements are standardized for each variable (column), by subtracting the variable's mean value and dividing by the variable's me
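As a small illustration of the two `metric` options, the following sketch computes both distances by hand for a pair of made-up observations and checks them against base R's `dist`. The data values are arbitrary, and the manhattan formula used here is the standard sum of absolute differences.

```r
# Two made-up observations (rows) measured on two variables (columns).
x <- rbind(c(0, 0), c(3, 4))

# Euclidean: root sum-of-squares of the differences.
d_euclid <- sqrt(sum((x[1, ] - x[2, ])^2))

# Manhattan: sum of the absolute differences (standard definition).
d_manhattan <- sum(abs(x[1, ] - x[2, ]))

# Cross-check the hand computation against base R's dist().
stopifnot(
  all.equal(d_euclid, as.numeric(dist(x, method = "euclidean"))),
  all.equal(d_manhattan, as.numeric(dist(x, method = "manhattan")))
)
```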

##### Details

`pam` is fully described in chapter 2 of Kaufman and Rousseeuw (1990). Compared to the k-means approach in `kmeans`, the function `pam` has the following features: (a) it also accepts a dissimilarity matrix; (b) it is more robust because it minimizes a sum of dissimilarities instead of a sum of squared euclidean distances; (c) it provides a novel graphical display, the silhouette plot (see `plot.partition`), which can also be used to select the number of clusters.
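One way to use the silhouette information for choosing `k` is to compare the average silhouette width over a few candidate values. The sketch below assumes the returned object exposes this as `silinfo$avg.width`, which is the component name in current releases of the cluster package; the simulated data and the candidate range `2:5` are illustrative choices only.

```r
library(cluster)

set.seed(1)
# 25 simulated observations in two well-separated groups.
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))

# Average silhouette width for each candidate number of clusters.
ks <- 2:5
avg_width <- sapply(ks, function(k) pam(x, k)$silinfo$avg.width)

# Candidate with the largest average silhouette width.
best_k <- ks[which.max(avg_width)]
```

A larger average width indicates tighter, better separated clusters, so `best_k` is a reasonable starting point rather than a definitive answer.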

The `pam` algorithm is based on the search for `k` representative objects, or medoids, among the observations of the dataset. These observations should represent the structure of the data. After finding a set of `k` medoids, `k` clusters are constructed by assigning each observation to the nearest medoid. The goal is to find `k` representative objects which minimize the sum of the dissimilarities of the observations to their closest representative object.

The algorithm first looks for a good initial set of medoids (this is called the BUILD phase). Then it finds a local minimum for the objective function, that is, a solution such that there is no single switch of an observation with a medoid that will decrease the objective (this is called the SWAP phase).
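The two phases can be sketched in plain R as follows. This is a naive illustration written for readability, not the implementation used by `pam`; the helper names `pam_cost` and `pam_sketch` are hypothetical, and the greedy BUILD rule shown is one straightforward reading of "a good initial set of medoids".

```r
# Objective: sum of dissimilarities of each observation to its nearest medoid.
pam_cost <- function(d, medoids) {
  sum(apply(d[, medoids, drop = FALSE], 1, min))
}

# d: full symmetric dissimilarity matrix, k: number of clusters (k < nrow(d)).
pam_sketch <- function(d, k) {
  n <- nrow(d)

  # BUILD: start with the most central observation, then greedily add the
  # observation that lowers the objective the most.
  medoids <- unname(which.min(rowSums(d)))
  while (length(medoids) < k) {
    candidates <- setdiff(seq_len(n), medoids)
    costs <- sapply(candidates, function(j) pam_cost(d, c(medoids, j)))
    medoids <- c(medoids, candidates[which.min(costs)])
  }

  # SWAP: apply the best single medoid/non-medoid exchange until no exchange
  # decreases the objective (a local minimum).
  repeat {
    best <- pam_cost(d, medoids)
    best_set <- medoids
    for (i in seq_along(medoids)) {
      for (j in setdiff(seq_len(n), medoids)) {
        trial <- medoids
        trial[i] <- j
        cost <- pam_cost(d, trial)
        if (cost < best) {
          best <- cost
          best_set <- trial
        }
      }
    }
    if (identical(best_set, medoids)) break
    medoids <- best_set
  }

  # Assign every observation to its nearest medoid.
  clustering <- unname(apply(d[, medoids, drop = FALSE], 1, which.min))
  list(medoids = medoids, clustering = clustering, objective = best)
}

# Usage on simulated data with two well-separated groups.
set.seed(42)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))
fit <- pam_sketch(as.matrix(dist(x)), k = 2)
table(fit$clustering)
```

Each SWAP pass re-evaluates the full objective for every candidate exchange, which keeps the code short but makes it much slower than a real implementation.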

##### Value

- an object of class `"pam"` representing the clustering. See `?pam.object` for details.

##### Note

For datasets larger than (say) 200 observations, `pam` will take a lot of computation time. In that case, the function `clara` is preferable.
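A minimal sketch of switching to `clara` for a larger dataset is shown below. The simulated data and the value `samples = 5` are illustrative assumptions; see `?clara` for the arguments available in your installed version.

```r
library(cluster)

set.seed(1)
# 500 simulated observations in two well-separated groups.
x <- rbind(cbind(rnorm(200, 0, 0.5), rnorm(200, 0, 0.5)),
           cbind(rnorm(300, 5, 0.5), rnorm(300, 5, 0.5)))

# clara clusters subsamples of the data instead of all observations at once.
clx <- clara(x, 2, samples = 5)

clx$medoids            # coordinates of the two medoids
table(clx$clustering)  # cluster sizes
```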

##### See Also

`agnes` for background and references; `pam.object`, `clara`, `daisy`, `partition.object`, `plot.partition`, `dist`.

##### Examples

```
library(cluster)

## generate 25 objects, divided into 2 clusters.
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))

## cluster the observations around 2 medoids and inspect the result
pamx <- pam(x, 2)
pamx
summary(pamx)
plot(pamx)

## cluster a precomputed manhattan dissimilarity object instead
pam(daisy(x, metric = "manhattan"), 2, diss = TRUE)

data(ruspini)
## Plot similar to Figure 4 in Struyf et al (1996)
plot(pam(ruspini, 4), ask = TRUE)
```

*Documentation reproduced from package cluster, version 1.4-1, License: GPL version 2 or later*