clustering(y, disMethod = "Euclidean")
The procedure starts from an initial data point, point[1].
We find the nearest neighbor of point[1], store it in point[2], and store the dissimilarity between point[1] and point[2] in db[1].
We then remove point[1] from consideration, find the nearest neighbor of point[2] among the remaining points, store it in point[3], and store the dissimilarity between point[2] and point[3] in db[2].
We then remove point[2] and find the nearest neighbor of point[3].
We repeat this procedure until we have point[n] and db[n-1], where n is the total number of data points.
Next, we calculate the interquartile range (IQR) of the vector db.
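The nearest-neighbor chain described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name nn_chain and its start argument are hypothetical, the chain is assumed to begin at the first row, and dissimilarities are plain Euclidean distances from stats::dist.

```r
# Hypothetical helper (not part of any package): builds the chain of
# nearest neighbors and the vector of consecutive dissimilarities.
nn_chain <- function(y, start = 1L) {
  y <- as.matrix(y)
  n <- nrow(y)
  d <- as.matrix(dist(y))            # pairwise Euclidean dissimilarities
  point <- integer(n)                # row indices of y in visiting order
  db <- numeric(n - 1)               # db[i]: dissimilarity of point[i], point[i + 1]
  point[1] <- as.integer(start)      # assumption: chain starts at row `start`
  remaining <- setdiff(seq_len(n), point[1])
  for (i in seq_len(n - 1)) {
    cand <- d[point[i], remaining]   # distances to points not yet visited
    j <- which.min(cand)             # nearest remaining neighbor
    point[i + 1] <- remaining[j]
    db[i] <- cand[j]
    remaining <- remaining[-j]       # "remove" the point just linked
  }
  list(point = point, db = db)
}

# Two well-separated groups: rows 1-3 and rows 4-5
y <- rbind(c(0, 0), c(0, 1), c(0, 2), c(10, 0), c(10, 1))
res <- nn_chain(y)
res$point   # 1 2 3 5 4
res$db      # one large jump where the chain crosses between the groups
```

Precomputing the full distance matrix keeps the sketch short but costs O(n^2) memory, so it is only meant for small illustrative data sets.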
We then check which elements of db are larger than avg + 1.5 * IQR, where avg is the average of the vector db.
The minimum of these outlier dissimilarities is stored in omin.
An estimate of the number of clusters is g, where g-1 is the number of outlier dissimilarities.
The position of an outlier dissimilarity indicates the end of one cluster and the start of the next.
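As a rough illustration of the thresholding step, the sketch below takes a visiting order and its dissimilarity vector and derives the outlier positions, omin, g, and a membership vector. The helper name split_chain is hypothetical, and the way labels are assigned from outlier positions is an assumption based on the description above, not a copy of the package code. The input values are made up for the example.

```r
# Hypothetical helper (not part of any package): cuts the chain wherever
# the dissimilarity exceeds avg + 1.5 * IQR.
split_chain <- function(point, db) {
  avg <- mean(db)
  cutoff <- avg + 1.5 * IQR(db)      # threshold described in the text
  is_out <- db > cutoff              # outlier dissimilarities
  omin <- if (any(is_out)) min(db[is_out]) else NA_real_
  g <- sum(is_out) + 1               # g - 1 outliers give g clusters
  # db[i] links point[i] and point[i + 1], so an outlier at position i
  # ends one cluster at point[i] and starts the next at point[i + 1].
  chain_labels <- cumsum(c(1, is_out))
  mem <- integer(length(point))
  mem[point] <- chain_labels         # map labels back to original row order
  list(g = g, omin = omin, mem = mem)
}

# Made-up chain over five points: one large gap between the 3rd and 4th visit
point <- c(1, 2, 3, 5, 4)
db <- c(1, 1, sqrt(101), 1)
out <- split_chain(point, db)
out$g     # 2
out$mem   # 1 1 1 2 2
```

When no dissimilarity exceeds the cutoff, the sketch reports a single cluster and leaves omin as NA.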
To get a reasonable clustering result, data sharpening (shrinking) is recommended before data clustering.
See also: shrinking
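# load the package (assumed here to be clues, which provides clustering() and the Ruspini data)
library(clues)
# ruspini data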
data(Ruspini)
# data matrix
ruspini <- Ruspini$ruspini
tt <- clustering(ruspini)
plotClusters(ruspini, tt$mem)