KMeans

K-Means Clustering Using Multiple Random Seeds

Finds a number of k-means clustering solutions using R's kmeans function, each from a different random start, and selects as the final solution the one with the minimum total within-cluster sum of squared distances.
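The selection rule described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package's actual source: `multi_seed_kmeans` is a hypothetical helper name, and it relies only on `stats::kmeans` and its `tot.withinss` component.

```r
# Hypothetical helper (not the package's implementation): run stats::kmeans
# from several random starts and keep the solution with the smallest
# total within-cluster sum of squares.
multi_seed_kmeans <- function(x, centers, iter.max = 10, num.seeds = 10) {
  x <- as.matrix(x)
  best <- NULL
  for (i in seq_len(num.seeds)) {
    # Each call draws its own random initial centres.
    fit <- kmeans(x, centers = centers, iter.max = iter.max)
    if (is.null(best) || fit$tot.withinss < best$tot.withinss) {
      best <- fit
    }
  }
  best
}

set.seed(1)  # make the random starts reproducible
fit <- multi_seed_kmeans(USArrests, centers = 3)
fit$tot.withinss
```

Note that `stats::kmeans` also accepts an `nstart` argument, which performs several random starts internally and returns the best one.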

Keywords
misc
Usage
KMeans(x, centers, iter.max=10, num.seeds=10)
Arguments
x
A numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns).
centers
The number of clusters in the solution.
iter.max
The maximum number of iterations allowed.
num.seeds
The number of different starting random seeds to use. Each random seed results in a different k-means solution.
Value

A list with components:

cluster
A vector of integers indicating the cluster to which each point is allocated.
centers
A matrix of cluster centres (centroids).
withinss
The within-cluster sum of squares for each cluster.
tot.withinss
The within-cluster sum of squares summed across clusters.
betweenss
The between-cluster sum of squared distances.
size
The number of points in each cluster.
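As a rough illustration of how these components relate to one another, the sketch below checks a few identities on a fit from `stats::kmeans`, used here as a stand-in because it returns components with the same names. The variable names `fit`, `x`, and `total_ss` are introduced only for this example.

```r
# Stand-in fit from stats::kmeans; nstart = 5 tries five random starts.
set.seed(1)
x <- as.matrix(USArrests)
fit <- kmeans(x, centers = 3, nstart = 5)

# One cluster label per row of the data.
length(fit$cluster) == nrow(x)

# Cluster sizes add up to the number of observations.
sum(fit$size) == nrow(x)

# The total within-cluster sum of squares is the sum over clusters.
all.equal(fit$tot.withinss, sum(fit$withinss))

# Between-cluster plus within-cluster sums of squares give the total
# sum of squares around the overall column means.
total_ss <- sum(scale(x, center = TRUE, scale = FALSE)^2)
all.equal(fit$betweenss + fit$tot.withinss, total_ss)
```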

See Also
kmeans
Examples
data(USArrests)
KMeans(USArrests, centers=3, iter.max=5, num.seeds=5)