Creates plots for visualizing a partition object.

plot.partition(x, ask = FALSE, which.plots = NULL,
               nmax.lab = 40, max.strlen = 5,
               cor = TRUE, stand = FALSE, lines = 2,
               shade = FALSE, color = FALSE, labels = 0, plotchar = TRUE,
               span = TRUE, xlim = NULL, ylim = NULL, ...)
If ask is TRUE and which.plots is NULL, plot.partition operates in interactive mode, via menu.

Otherwise, which.plots must contain integers of 1 for a clusplot or 2 for silhouette.

All optional arguments available for the clusplot.default function (except for the diss one) may also be supplied to this function. Graphical parameters (see par) may also be supplied as arguments to this function.

When ask= TRUE, rather than producing each plot sequentially, plot.partition displays a menu listing all the plots that can be produced. If the menu is not desired but a pause between plots is still wanted, one must set par(ask= TRUE) before invoking the plot command.

The clusplot of a cluster partition consists of a two-dimensional representation of the observations, in which the clusters are indicated by ellipses. (See clusplot.partition for more details.)
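As a minimal sketch of the non-interactive use described above, the following selects each plot explicitly through which.plots. The simulated data, the seed, and the variable name fit are illustrative assumptions, not part of the original page.

```r
library(cluster)

## Illustrative data: 25 objects in two well-separated groups (assumed setup).
set.seed(1)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))
fit <- pam(x, 2)

## which.plots = 1 requests the clusplot only.
plot(fit, which.plots = 1)

## which.plots = 2 requests the silhouette plot only.
plot(fit, which.plots = 2)
```

Because which.plots is given, no menu is shown and each call draws a single plot.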
The silhouette plot of a nonhierarchical clustering is fully described in
Rousseeuw (1987) and in chapter 2 of Kaufman and Rousseeuw (1990).
For each observation i, a bar is drawn, representing the silhouette width s(i)
of the observation. Observations are grouped per cluster, starting with
cluster 1 at the top. Observations with a large s(i) (almost 1) are very well
clustered; a small s(i) (around 0) means that the observation lies between
two clusters; and observations with a negative s(i) are probably placed in
the wrong cluster.
A clustering can be performed for several values of k (the number of
clusters). Finally, choose the value of k with the largest overall
average silhouette width.
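The selection of k described above might be sketched as follows. The candidate range 2:6, the simulated data, and the names ks, avg_width and best_k are assumptions made for illustration; the average width is read from the silinfo component of the pam result.

```r
library(cluster)

## Illustrative data: 25 objects in two well-separated groups (assumed setup).
set.seed(1)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))

## Candidate numbers of clusters (an arbitrary range chosen for this sketch).
ks <- 2:6

## Overall average silhouette width of the pam clustering for each k.
avg_width <- sapply(ks, function(k) pam(x, k)$silinfo$avg.width)

## Keep the k with the largest overall average silhouette width.
best_k <- ks[which.max(avg_width)]
print(data.frame(k = ks, avg_width = avg_width))
print(best_k)
```

The same loop should work with clara or fanny in place of pam, provided the fitted object carries silhouette information.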
The silhouette width is computed as follows: Put a(i) = average dissimilarity between i and all other points of the cluster to which i belongs. For all other clusters C, put d(i,C) = average dissimilarity of i to all observations of C. The smallest of these d(i,C) is denoted as b(i), and can be seen as the dissimilarity between i and its neighbor cluster. Finally, put s(i) = ( b(i) - a(i) ) / max( a(i), b(i) ). The overall average silhouette width is then simply the average of s(i) over all observations i.
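To make the definition concrete, here is a hand-rolled sketch of s(i) on simulated data, compared against the silhouette function of the cluster package. The helper sil_width and the variable names are hypothetical and written only for this illustration. Euclidean distances are assumed, and b(i) is taken over clusters other than the one containing i.

```r
library(cluster)

## Illustrative data: 25 objects in two well-separated groups (assumed setup).
set.seed(42)
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))
fit <- pam(x, 2)
cl  <- fit$clustering
d   <- as.matrix(dist(x))   # Euclidean dissimilarities

## Hypothetical helper: silhouette width of observation i.
sil_width <- function(i) {
  own <- cl == cl[i]
  own[i] <- FALSE                      # exclude i itself from its own cluster
  if (!any(own)) return(0)             # singleton cluster: s(i) set to 0
  a <- mean(d[i, own])                 # a(i)
  others <- setdiff(unique(cl), cl[i])
  b <- min(sapply(others, function(C) mean(d[i, cl == C])))  # b(i)
  (b - a) / max(a, b)
}

s_manual <- sapply(seq_len(nrow(x)), sil_width)

## Reference values from the package, in the original observation order.
s_pkg <- silhouette(cl, dist(x))[, "sil_width"]

print(all.equal(unname(s_manual), unname(s_pkg)))
print(mean(s_manual))   # overall average silhouette width
```

Calling silhouette on the clustering vector and the distance object keeps the observations in their original order, which makes the element-wise comparison straightforward.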
Further, the references in plot.agnes.
partition.object, clusplot.partition, clusplot.default, pam, pam.object, clara, clara.object, fanny, fanny.object, par.

library(cluster)
## generate 25 objects, divided into 2 clusters.
x <- rbind(cbind(rnorm(10,0,0.5), rnorm(10,0,0.5)),
           cbind(rnorm(15,5,0.5), rnorm(15,5,0.5)))
## cluster around 2 medoids and draw both the clusplot and the silhouette plot
plot(pam(x, 2))