kmeansruns calls kmeans to perform a k-means clustering, but initializes the
k-means algorithm several times with random points from the data set as
means. It is also more robust against the occurrence of empty clusters in
the algorithm, and it estimates the number of clusters by either the
Calinski-Harabasz index (calinhara) or the average silhouette width (see
pam.object). The Duda-Hart test (dudahart2) is applied to decide whether
there should be more than one cluster (unless 1 is excluded as the number of
clusters).

Usage

kmeansruns(data,krange=2:10,criterion="ch",
           iter.max=100,runs=100,
           scaledata=FALSE,alpha=0.001,
           critout=FALSE,plot=FALSE,...)
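
The procedure described above (several random starts per number of clusters, then a criterion to pick the number of clusters) can be sketched in base R. This is an illustrative re-creation, not the fpc implementation: the helper names ch_index and best_of_runs and the toy data are hypothetical, and the sketch leaves out the Duda-Hart test and the handling of empty clusters.

```r
# Sketch only: base-R illustration of multi-start k-means with
# Calinski-Harabasz selection of k. Not the fpc source code.
set.seed(1)

# Hypothetical toy data: three Gaussian blobs in two dimensions.
centers <- rbind(c(0, 0), c(5, 5), c(0, 5))
x <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(40, centers[i, 1], 0.5), rnorm(40, centers[i, 2], 0.5))
}))

# Calinski-Harabasz index computed from the sums of squares
# reported by kmeans(): (B / (k - 1)) / (W / (n - k)).
ch_index <- function(km, n) {
  k <- length(km$size)
  (km$betweenss / (k - 1)) / (km$tot.withinss / (n - k))
}

# Keep the best of `runs` random starts for a given k, i.e. the
# solution with the smallest total within-cluster sum of squares.
best_of_runs <- function(x, k, runs = 20, iter.max = 100) {
  best <- NULL
  for (i in seq_len(runs)) {
    km <- kmeans(x, centers = k, iter.max = iter.max)
    if (is.null(best) || km$tot.withinss < best$tot.withinss) best <- km
  }
  best
}

krange <- 2:6
fits <- lapply(krange, function(k) best_of_runs(x, k))
crit <- sapply(fits, ch_index, n = nrow(x))
names(crit) <- krange

# The estimated number of clusters maximizes the criterion.
best_k <- krange[which.max(crit)]
print(round(crit, 1))
print(best_k)
```

Passing a single number as centers makes kmeans pick that many random rows of the data as initial means, which matches the initialization described above.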
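
Arguments

data: the data set to be clustered.
krange: numbers of clusters to be compared. Neither criterion can estimate a single cluster (nc=1). If 1 is included, a Duda-Hart test is applied to decide whether there should be more than one cluster.
criterion: one of "asw" or "ch". Determines whether average silhouette width or Calinski-Harabasz is applied.
iter.max: maximum number of iterations, as in kmeans.
runs: number of random starts of the k-means algorithm.
scaledata: if TRUE, the variables are centered and scaled to unit variance before execution.
alpha: passed to dudahart2 (only used for the 1-cluster test).
critout: if TRUE, the criterion value is printed out for every number of clusters.
plot: if TRUE, every clustering resulting from a run of the algorithm is plotted.
...: further arguments passed on to kmeans.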
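
As a small illustration of what centering and scaling to unit variance means for scaledata=TRUE, the following base-R sketch applies scale() to made-up data. It assumes the option behaves like scale() with its default settings; the variable names and the data are hypothetical.

```r
set.seed(42)
# Hypothetical data: two variables on very different scales.
x <- cbind(height_cm = rnorm(100, 170, 10),
           income    = rnorm(100, 40000, 8000))

# Center each column and divide it by its standard deviation.
xs <- scale(x)

print(round(colMeans(xs), 10))  # column means after scaling
print(apply(xs, 2, sd))         # column standard deviations after scaling
```

Without scaling, the variable with the larger spread would dominate the Euclidean distances that k-means relies on.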

Value

The output of the optimal run of the kmeans function with added components
bestk and crit. A list with components including

bestk: the estimated number of clusters.
crit: values of criterion for all used numbers of clusters (0 if number not tried).

References

Duda, R. O. and Hart, P. E. (1973). Pattern Classification and Scene Analysis. Wiley, New York.
Hartigan, J. A. and Wong, M. A. (1979). A K-means clustering algorithm. Applied Statistics, 28, 100-108.
Kaufman, L. and Rousseeuw, P. J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.

See Also

kmeans, pamk, calinhara, dudahart2
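
Examples

library(fpc)   # provides kmeansruns and rFace
set.seed(20000)
# Simulated face-shaped benchmark data in 2 dimensions
face <- rFace(50,dMoNo=2,dNoEy=0,p=2)
# Compare 1 to 5 clusters with 2 starts each; critout=TRUE prints the criterion per k
pka <- kmeansruns(face,krange=1:5,critout=TRUE,runs=2,criterion="asw")
pkc <- kmeansruns(face,krange=1:5,critout=TRUE,runs=2,criterion="ch")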
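
The added components bestk and crit can be read off the returned list as shown in this short sketch. It assumes the fpc package is installed; the object name fit is arbitrary, and no output is shown because the values depend on the random starts.

```r
library(fpc)  # assumed to be installed; provides kmeansruns and rFace

set.seed(20000)
face <- rFace(50, dMoNo = 2, dNoEy = 0, p = 2)
fit <- kmeansruns(face, krange = 1:5, runs = 2, criterion = "asw")

# Estimated number of clusters.
print(fit$bestk)

# Criterion value for each number of clusters; entries for numbers
# that were not tried (or for which the criterion is undefined) are 0.
print(fit$crit)
```

With only two starts per number of clusters the result can vary between seeds; a larger runs value (the default is 100) trades run time for stability.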