kmeansruns calls the function kmeans to perform k-means clustering,
  but restarts the k-means algorithm several times, each time with
  random points from the data set as initial means. It is also more
  robust against empty clusters occurring in the algorithm, and it
  estimates the number of clusters by either the Calinski-Harabasz
  index (calinhara) or the average silhouette width (see
  pam.object). If 1 is included among the candidate numbers of
  clusters, the Duda-Hart test (dudahart2) is applied to decide
  whether there should be more than one cluster.
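To make the selection step concrete, here is a minimal sketch in base R of choosing the number of clusters by the Calinski-Harabasz index. It is not the fpc implementation: the toy data, the helper ch_index and the use of kmeans' own nstart argument for the random restarts are assumptions made for illustration, and the Duda-Hart test for a single cluster is left out.

```r
# Sketch only: choose k by the Calinski-Harabasz index with base R kmeans.
# This is NOT the fpc implementation of kmeansruns.
set.seed(1)

# Toy data (made up): three well-separated Gaussian groups in two dimensions.
x <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2),
  cbind(rnorm(50, mean = 0, sd = 0.3), rnorm(50, mean = 3, sd = 0.3))
)
n <- nrow(x)
krange <- 2:6  # the index is undefined for k = 1

# Hypothetical helper:
# Calinski-Harabasz index = (between SS / (k - 1)) / (within SS / (n - k)).
ch_index <- function(fit, n) {
  k <- nrow(fit$centers)
  (fit$betweenss / (k - 1)) / (fit$tot.withinss / (n - k))
}

# nstart makes kmeans start from several random sets of data points and
# keep the solution with the smallest total within-cluster sum of squares.
fits <- lapply(krange, function(k) {
  kmeans(x, centers = k, nstart = 20, iter.max = 100)
})

# One criterion value per candidate k; the largest value wins.
ch <- vapply(fits, ch_index, numeric(1), n = n)
names(ch) <- krange
best_k <- krange[which.max(ch)]
print(ch)
print(best_k)
```

Larger values of the index indicate compact, well-separated clusters, so the candidate with the largest value is taken as the estimate.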
kmeansruns(data,krange=2:10,criterion="ch",
                       iter.max=100,runs=100,
                       scaledata=FALSE,alpha=0.001,
                       critout=FALSE,plot=FALSE,...)
data: a numeric matrix of data, or an object that can be coerced to such
    a matrix (such as a numeric vector or a data frame with all numeric
    columns).
krange: integer vector. Numbers of clusters which are to be
    compared by the selected criterion. Note: average
    silhouette width and Calinski-Harabasz can't estimate number of
    clusters nc=1. If 1 is included, a Duda-Hart test is applied
    and 1 is estimated if this is not significant.
criterion: one of "asw" or "ch". Determines
    whether average silhouette width or Calinski-Harabasz is applied.
iter.max: integer. The maximum number of iterations allowed.
runs: integer. Number of starts of the k-means algorithm.
scaledata: logical. If TRUE, the variables are centered
    and scaled to unit variance before execution.
alpha: numeric between 0 and 1, tuning constant for
    dudahart2 (only used for 1-cluster test).
critout: logical. If TRUE, the criterion value is printed
    out for every number of clusters.
plot: logical. If TRUE, every clustering resulting from a
    run of the algorithm is plotted.
...: further arguments to be passed on to kmeans.
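As an illustration of what "centered and scaled to unit variance" means for scaledata, the following sketch applies base R's scale() to a small made-up matrix. Whether kmeansruns uses scale() internally is an assumption here; the column names and values are invented.

```r
# Sketch: centering and scaling to unit variance with base R scale().
# Assumption: this mirrors the effect of scaledata=TRUE; data are made up.
set.seed(2)
raw <- cbind(height_cm = rnorm(30, mean = 170, sd = 10),
             income    = rnorm(30, mean = 40000, sd = 8000))

# Subtract each column's mean, then divide by its standard deviation.
scaled <- scale(raw)

col_means <- colMeans(scaled)      # numerically zero for every column
col_sds   <- apply(scaled, 2, sd)  # one for every column
print(col_means)
print(col_sds)
```

Without scaling, a variable measured on a large scale (income here) would dominate the Euclidean distances that k-means minimizes.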
The output of the optimal run of the kmeans function,
  with added components bestk and crit.
  A list with components
cluster: a vector of integers indicating the cluster to which each point is allocated.
centers: a matrix of cluster centers.
withinss: the within-cluster sum of squares for each cluster.
size: the number of points in each cluster.
bestk: the optimal number of clusters.
crit: vector with values of the criterion for all used numbers of
  clusters (0 if number not tried).
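The components inherited from kmeans can be inspected on a plain kmeans fit, as in the following base R sketch on made-up data. The components bestk and crit are added by kmeansruns and therefore do not appear in this object.

```r
# Sketch: the kmeans components that kmeansruns passes through.
# Toy data (made up): two groups of 20 points in two dimensions.
set.seed(3)
x <- rbind(matrix(rnorm(40, mean = 0, sd = 0.5), ncol = 2),
           matrix(rnorm(40, mean = 4, sd = 0.5), ncol = 2))

fit <- kmeans(x, centers = 2, nstart = 5)

length(fit$cluster)  # one cluster label per row of x
dim(fit$centers)     # number of clusters by number of variables
fit$withinss         # one within-cluster sum of squares per cluster
fit$size             # number of points per cluster

# size is the tabulation of the cluster labels.
label_counts <- as.vector(table(fit$cluster))
```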
Calinski, T. and Harabasz, J. (1974). A dendrite method for cluster analysis. Communications in Statistics, 3, 1-27.
Duda, R. O. and Hart, P. E. (1973). Pattern Classification and Scene Analysis. Wiley, New York.
Hartigan, J. A. and Wong, M. A. (1979). A K-means clustering algorithm. Applied Statistics, 28, 100-108.
Kaufman, L. and Rousseeuw, P. J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.
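# NOT RUN {
  library(fpc)  # provides kmeansruns and rFace
  options(digits=3)
  set.seed(20000)
  # simulate a 2-dimensional face-shaped benchmark data set
  face <- rFace(50,dMoNo=2,dNoEy=0,p=2)
  # estimate the number of clusters in 1:5 by average silhouette width
  pka <- kmeansruns(face,krange=1:5,critout=TRUE,runs=2,criterion="asw")
  # the same with the Calinski-Harabasz index
  pkc <- kmeansruns(face,krange=1:5,critout=TRUE,runs=2,criterion="ch")
# }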