findClusterNumber: Using SillyPutty to find the number of clusters
Description
A function designed to find an approximation of the true
number, K, of clusters in a dataset. The findClusterNumber
function calls RandomSillyPutty for each value of K in the
range from start to end, performing N random
starts each time.
NOTE: start must be greater than 1. The function can be slow, depending on
the complexity of the dataset and the number of iterations, N.
Usage
findClusterNumber(distobj, start, end, N = 100,
                  method = c("SillyPutty", "HCSP"), ...)
Value
A list containing the maximum silhouette width for each K in the range of
candidate cluster numbers, from start to end.
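One common way to use such a list is to pick the K with the largest maximum silhouette width. The helper below is a minimal sketch of that step. The name bestK is hypothetical and not part of SillyPutty, and the code assumes the list holds exactly one numeric value per K, ordered from start to end; check the structure of the actual return value before relying on it.

```r
# Hypothetical helper (not part of SillyPutty): choose the K whose maximum
# silhouette width is largest.
# Assumption: `sw` holds one numeric value per K, ordered from `start` to `end`.
bestK <- function(sw, start, end) {
  widths <- unlist(sw)          # flatten the list to a numeric vector
  ks <- seq(start, end)         # candidate cluster numbers
  stopifnot(length(widths) == length(ks))
  ks[which.max(widths)]         # first K attaining the largest width
}
```

Ties are resolved in favor of the smallest K, because which.max returns the first maximum.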
Arguments
distobj
An object of class dist representing a distance matrix.
start
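The minimum cluster number in the range of clusters to test. Must be greater than 1.
end
The maximum cluster number in the range of clusters to test.
N
The number of random starts (iterations) to perform for each value of K. Defaults to 100.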
method
Whether to use the full RandomSillyPutty
algorithm ("SillyPutty") or the hybrid method of hierarchical clustering
followed by SillyPutty ("HCSP").
Details
The findClusterNumber function processes one distance matrix at
a time, running N iterations for each candidate cluster number. It
returns a list of the maximum silhouette width values obtained from
those N iterations, each paired with its associated cluster number.
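A call might look like the following sketch. The simulated data, the seed, and the choices start = 2, end = 5, and N = 10 are illustrative assumptions. The call is guarded so that the script still runs when the SillyPutty package is not installed.

```r
# Simulated data (illustrative only): three well-separated Gaussian groups,
# 20 points each, in two dimensions.
set.seed(1)
centers <- c(0, 5, 10)
x <- do.call(rbind, lapply(centers, function(m) {
  matrix(rnorm(40, mean = m), ncol = 2)
}))
distobj <- dist(x)  # Euclidean distances, an object of class "dist"

if (requireNamespace("SillyPutty", quietly = TRUE)) {
  # start must be greater than 1; a small N keeps the run short.
  sw <- SillyPutty::findClusterNumber(distobj, start = 2, end = 5, N = 10,
                                      method = "SillyPutty")
  str(sw)  # inspect the returned list of maximum silhouette widths
}
```

A larger N explores more random starts for each K at the cost of a longer run time.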