useful (version 1.2.6.1)

FitKMeans: Fit a series of kmeans clusterings and compute Hartigan's Number

Description

Given a numeric dataset, this function fits a series of kmeans clusterings with an increasing number of centers. The k-means fit is compared to the (k+1)-means fit using Hartigan's Number to determine whether the (k+1)st cluster should be added.

Usage

FitKMeans(x, max.clusters = 12L, spectral = FALSE, nstart = 1L,
  iter.max = 10L, algorithm = c("Hartigan-Wong", "Lloyd", "Forgy",
  "MacQueen"), seed = NULL)

Value

A data.frame with columns for the number of clusters, the Hartigan Number, and whether that cluster should be added based on Hartigan's Number.

Arguments

x

The data, numeric, either a matrix or data.frame

max.clusters

The maximum number of clusters that should be tried

spectral

logical; whether the data being fit are eigenvectors for spectral clustering

nstart

The number of random starts for the kmeans algorithm to use

iter.max

Maximum number of iterations before the kmeans algorithm gives up on convergence

algorithm

The algorithm to use for kmeans. Options are c("Hartigan-Wong", "Lloyd", "Forgy", "MacQueen"). See kmeans.

seed

If not null, the random seed will be reset before each application of the kmeans algorithm
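
To illustrate why resetting the seed matters, here is a minimal sketch that uses base R's kmeans directly rather than FitKMeans. It assumes the seed is reset with set.seed before each fit, which is one plausible reading of the seed argument, not a confirmed description of the package internals. The seed value 42 and the choice of 3 centers are arbitrary.

```r
# Sketch: resetting the seed before each kmeans call makes the
# random starting centers, and therefore the fit, reproducible.
x <- as.matrix(iris[, 1:4])

set.seed(42)  # arbitrary seed, chosen for illustration
fit1 <- kmeans(x, centers = 3, nstart = 5)

set.seed(42)  # same seed again before the second fit
fit2 <- kmeans(x, centers = 3, nstart = 5)

# Both fits start from the same random centers, so they agree exactly
same_clusters <- identical(fit1$cluster, fit2$cluster)
same_wss <- identical(fit1$tot.withinss, fit2$tot.withinss)
print(c(same_clusters = same_clusters, same_wss = same_wss))
```

Without the repeated set.seed call, the two fits may start from different centers and can settle on different clusterings.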

Author

Jared P. Lander www.jaredlander.com

Details

A consecutive series of kmeans fits is computed with increasing k (number of centers). The results for k and k+1 are compared using Hartigan's Number. If the number is greater than 10, adding the (k+1)st cluster is flagged as worthwhile.
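
The comparison can be sketched with base R's kmeans. This is an illustration under assumptions, not the package's actual implementation: it uses the usual definition of Hartigan's Number, H(k) = (W(k) / W(k+1) - 1) * (n - k - 1), where W(k) is the total within-cluster sum of squares for k clusters and n is the number of rows. The helper name hartigan_number and its arguments are hypothetical.

```r
# Hypothetical helper (not part of the useful package): Hartigan's Number
# for moving from k to k + 1 clusters, using the usual definition
#   H(k) = (W(k) / W(k + 1) - 1) * (n - k - 1)
hartigan_number <- function(x, k, nstart = 10L, seed = NULL) {
  x <- as.matrix(x)
  n <- nrow(x)

  # Optionally reset the seed before each fit for reproducibility
  if (!is.null(seed)) set.seed(seed)
  fit_k <- kmeans(x, centers = k, nstart = nstart)

  if (!is.null(seed)) set.seed(seed)
  fit_k1 <- kmeans(x, centers = k + 1L, nstart = nstart)

  # tot.withinss is the total within-cluster sum of squares
  (fit_k$tot.withinss / fit_k1$tot.withinss - 1) * (n - k - 1)
}

x <- iris[, -ncol(iris)]
ks <- 2:6
h <- vapply(ks, function(k) hartigan_number(x, k, seed = 1L), numeric(1))

# Flag the (k + 1)st cluster as worthwhile when the number exceeds 10
result <- data.frame(k = ks, hartigan = h, add_next = h > 10)
print(result)
```

The column names in result are chosen for this sketch and are not necessarily those returned by FitKMeans.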

References

http://www.stat.columbia.edu/~madigan/DM08/descriptive.ppt.pdf

See Also

kmeans PlotHartigan

Examples

data(iris)
hartiganResults <- FitKMeans(iris[, -ncol(iris)])
PlotHartigan(hartiganResults)
