Initializes the cluster prototypes matrix with the Simple Cluster Seeking (SCS) algorithm (Tou & Gonzalez, 1974) over a selected feature.
scseek2(x, k, sfidx, tv)
a numeric vector, data frame or matrix.
an integer for the number of clusters.
an integer specifying the column index of the selected feature for random sampling. If missing, it is determined internally by comparing the coefficients of variation of all features in the data set; the feature with the maximum coefficient of variation is used as the selected feature.
a number to be used as the threshold distance, supplied directly by the user. Alternatively, a threshold distance value T can be computed with one of the following options of the tv argument:
T is the mean of differences between the consecutive pairs of objects with the option cd1.
T is the minimum of differences between the consecutive pairs of objects with the option cd2.
T is the mean of Euclidean distances between the consecutive pairs of objects divided by k with the option md. This is the default if tv is not supplied by the user.
T is the range (maximum minus minimum) of Euclidean distances between the consecutive pairs of objects divided by k with the option mm.
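The feature selection and the threshold options described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper names cv_feature and threshold_sketch are hypothetical, the coefficient of variation is assumed to be sd / |mean|, "consecutive pairs" is assumed to mean adjacent rows in data order, and all distances are assumed to be taken on the selected feature only (where the Euclidean distance reduces to an absolute difference).

```r
# Hypothetical helpers for illustration; they are not part of inaparc.

# Index of the column with the largest coefficient of variation (sd / |mean|).
cv_feature <- function(x) {
  x <- as.matrix(x)
  cv <- apply(x, 2, sd) / abs(colMeans(x))
  unname(which.max(cv))
}

# Threshold distance T for a given tv option, computed on one feature.
threshold_sketch <- function(x, k, sfidx, tv = c("md", "cd1", "cd2", "mm")) {
  if (is.numeric(tv)) return(tv)      # threshold supplied directly by the user
  tv <- match.arg(tv)                 # "md" when tv is not supplied
  x <- as.matrix(x)
  # Absolute differences between adjacent rows on the selected feature
  # (assumption: equals the one-dimensional Euclidean distance).
  d <- abs(diff(x[, sfidx]))
  switch(tv,
    cd1 = mean(d),                    # mean of consecutive differences
    cd2 = min(d),                     # minimum of consecutive differences
    md  = mean(d) / k,                # mean distance divided by k
    mm  = (max(d) - min(d)) / k       # range of distances divided by k
  )
}

data(iris)
sf <- cv_feature(iris[, 1:4])
print(sf)
print(threshold_sketch(iris[, 1:4], k = 5, sfidx = sf))
```

Under these assumptions the md option is simply the cd1 value divided by k, so it shrinks as more clusters are requested.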
an object of class ‘inaparc’, which is a list consisting of the following items:
a numeric matrix of the initial cluster prototypes.
an integer for the column index of the selected feature, which is used for random sampling.
a string representing the type of centroid, which is used to build the prototype matrix. Its value is ‘obj’ with this function because the cluster prototype matrix contains the sampled objects.
a string containing the matched function call that generates this ‘inaparc’ object.
The scseek2 function is a novel variant of the function scseek, which is based on the Simple Cluster Seeking (SCS) algorithm (Tou & Gonzalez, 1974). It differs from SCS in that the distances and the threshold value are computed over a single selected feature, the one having the maximum coefficient of variation, instead of over all the features.
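The idea of running simple cluster seeking on one feature can be sketched as below. This is an illustrative sketch under stated assumptions, not the code of scseek2: the function name scs_feature_sketch and the argument name thr are hypothetical, objects are visited in data order, the first object seeds the first prototype, and an object becomes a new prototype when its distance on the selected feature to every existing prototype exceeds the threshold. What the package does when fewer than k prototypes are found is not covered here.

```r
# Hypothetical sketch; scs_feature_sketch is not an inaparc function.
scs_feature_sketch <- function(x, k, sfidx, thr) {
  x <- as.matrix(x)
  idx <- 1L                               # first object seeds the first prototype
  for (i in seq_len(nrow(x))[-1]) {
    if (length(idx) >= k) break           # stop once k prototypes are found
    # Distances from object i to all current prototypes, on one feature only.
    d <- abs(x[i, sfidx] - x[idx, sfidx])
    if (min(d) > thr) idx <- c(idx, i)    # far from every prototype: add it
  }
  x[idx, , drop = FALSE]                  # may hold fewer than k rows
}

# Small synthetic example: the second column is the selected feature.
toy <- cbind(a = 1:6, b = c(0, 1, 10, 11, 20, 21))
print(scs_feature_sketch(toy, k = 3, sfidx = 2, thr = 5))
```

Because only one column enters the distance computation, the returned rows are whole objects even though the other features play no part in choosing them.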
Tou, J.T. & Gonzalez, R.C. (1974). Pattern Recognition Principles. Addison-Wesley, Reading, MA. <ISBN:9780201075861>
aldaoud, ballhall, crsamp, firstk, forgy, hartiganwong, inofrep, inscsf, insdev, kkz, kmpp, ksegments, ksteps, lastk, lhsmaximin, lhsrandom, maximin, mscseek, rsamp, rsegment, scseek, spaeth, ssamp, topbottom, uniquek, ursamp
# NOT RUN {
data(iris)
# Run over 4th feature with the threshold value of 0.5
res <- scseek2(x=iris[,1:4], k=5, sfidx=4, tv=0.5)
v1 <- res$v
print(v1)
# Run with the internally computed default threshold value
res <- scseek2(x=iris[,1:4], k=5)
v2 <- res$v
print(v2)
# }