Calculates a proximity (dissimilarity) matrix based on the Goodall 1 (G1) similarity measure.
good1(data)
A data.frame or a matrix with cases in rows and variables in columns.
The function returns a dissimilarity matrix of size n x n, where n is the number of objects in the original dataset passed in the argument data.
The Goodall 1 similarity measure was presented by Boriah et al. (2008). It is a simple modification of the original Goodall measure (Goodall, 1966). The measure assigns higher weights to matches on infrequent categories.
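The weighting rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's own code: the helper name goodall1_sketch is hypothetical, the per-variable formula follows the Goodall 1 definition as given in Boriah et al. (2008), all variables are weighted equally, the dissimilarity is taken as one minus the similarity, the diagonal is set to zero, and the data are assumed to contain no missing values.

```r
# Hypothetical helper: a base-R sketch of the Goodall 1 measure.
# Assumptions: equal variable weights, dissimilarity = 1 - similarity,
# zero diagonal, no missing values, at least two objects.
goodall1_sketch <- function(data) {
  data <- as.data.frame(data, stringsAsFactors = FALSE)
  n <- nrow(data)  # number of objects
  m <- ncol(data)  # number of variables
  sim <- matrix(0, n, n)

  for (k in seq_len(m)) {
    x <- as.character(data[[k]])
    tab <- table(x)
    f <- setNames(as.numeric(tab), names(tab))  # category frequencies
    p2 <- f * (f - 1) / (n * (n - 1))           # estimate of squared probability

    # Score of a match on category q: one minus the summed p2 of all
    # categories that are no more frequent than q (rare matches score higher).
    score <- vapply(names(f), function(q) 1 - sum(p2[f <= f[[q]]]), numeric(1))

    s <- unname(score[x])          # match score of each object's own category
    same <- outer(x, x, "==")      # TRUE where two objects share a category
    sim <- sim + same * s          # mismatches contribute zero
  }

  d <- 1 - sim / m                 # average over variables, convert to dissimilarity
  diag(d) <- 0                     # assumed convention: an object is at distance 0 from itself
  d
}

# Toy data: two nominal variables, four objects
toy <- data.frame(a = c("x", "x", "y", "z"),
                  b = c("u", "v", "u", "v"))
d <- goodall1_sketch(toy)
```

In the toy data, objects 1 and 2 agree only on the most frequent category of variable a, so they remain fairly dissimilar; a match on a rarer category would lower the dissimilarity more.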
Boriah S., Chandola V., Kumar V. (2008). Similarity measures for categorical data: A comparative evaluation. In: Proceedings of the 8th SIAM International Conference on Data Mining, SIAM, p. 243-254.
Goodall D.W. (1966). A new similarity index based on probability. Biometrics, 22(4), p. 882.
eskin, good2, good3, good4, iof, lin, lin1, morlini, of, sm, ve, vm.
# load the package (assumed here to be nomclust, which provides good1 and data20)
library(nomclust)
# sample data
data(data20)
# dissimilarity matrix calculation
prox.good1 <- good1(data20)