Descriptive measures for analyzing objects of class "dendrogram".
ntb(dendro)
ultrametric(dendro)
mae(prox, ultr)
sdr(prox, ultr)
dendro
: Object of class "dendrogram" as produced by linkage() or by as.dendrogram() applied to the hierarchical trees returned by hclust() and agnes().
prox
: Object of class "dist" containing the proximity data used to build the dendrogram.
ultr
: Object of class "dist" containing the ultrametric distances in the dendrogram, sorted in the same order as the proximity data in prox.
ntb
: Returns a number between 0 and 1 representing the normalized tree balance of
the input dendrogram.
ultrametric
: Returns an object of class "dist" containing the ultrametric distance matrix sorted in the same order as the proximity matrix used to build the corresponding dendrogram.
mae
: Returns the normalized mean absolute error.
sdr
: Returns the space distortion ratio.
This package allows the calculation of several descriptive measures for dendrograms, such as normalized tree balance, cophenetic correlation coefficient, normalized mean absolute error, and space distortion ratio.
For each node in a dendrogram, its entropy is calculated using the
concept of Shannon's entropy, which gives a maximum entropy of 1 to nodes
merging subdendrograms with the same number of leaves. The average entropy
for all nodes in a dendrogram is called its tree balance. Normalized
tree balance, computed by the ntb() function, rescales the tree balance of a
dendrogram relative to the minimum tree balance of any dendrogram with the
same number of elements. Perfectly balanced dendrograms have a normalized
tree balance equal to 1, while binary dendrograms formed by chaining one new
element at a time have a normalized tree balance equal to 0.
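The balance computation described above can be sketched in base R. This is a minimal illustration, not the code behind ntb(): the helper names (node_entropy, tree_balance, chain_balance, normalized_balance) are hypothetical, the use of the number of merged subdendrograms as the logarithm base is an assumption, and the min-max rescaling in normalized_balance is an assumption chosen only because it reproduces the stated endpoints of 1 and 0.

```r
# Entropy of a single merge, given the leaf counts of the merged subdendrograms.
# Assumption: the log base is the number of children, so a binary merge of two
# equally sized subdendrograms has entropy 1.
node_entropy <- function(sizes) {
  p <- sizes / sum(sizes)
  -sum(p * log(p, base = length(sizes)))
}

# Tree balance: average entropy over all internal nodes of a "dendrogram".
tree_balance <- function(dendro) {
  entropies <- numeric(0)
  visit <- function(node) {
    if (is.leaf(node)) return(invisible(NULL))
    idx <- seq_along(node)
    sizes <- vapply(idx, function(i) attr(node[[i]], "members"), numeric(1))
    entropies <<- c(entropies, node_entropy(sizes))
    for (i in idx) visit(node[[i]])
  }
  visit(dendro)
  mean(entropies)
}

# Tree balance of a binary dendrogram on n > 2 leaves built by chaining one
# new element at a time (taken here as the minimum, following the text above).
chain_balance <- function(n) {
  mean(vapply(2:n, function(m) node_entropy(c(m - 1, 1)), numeric(1)))
}

# Assumed rescaling: 0 for the chained dendrogram, 1 for a perfectly balanced one.
normalized_balance <- function(dendro) {
  b_min <- chain_balance(attr(dendro, "members"))
  (tree_balance(dendro) - b_min) / (1 - b_min)
}

# Toy data: the first tree is a chain, the second is perfectly balanced.
chained  <- as.dendrogram(hclust(dist(c(1, 2, 4, 8, 16)), method = "single"))
balanced <- as.dendrogram(hclust(dist(c(0, 1, 10, 11)), method = "complete"))
normalized_balance(chained)   # approximately 0 by construction
normalized_balance(balanced)  # approximately 1
```

Values from this sketch are not guaranteed to match ntb() on real data, since the package may treat tied merges or the normalization differently.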
To calculate the cophenetic correlation coefficient, the cor() function in
the stats package requires that the matrix of ultrametric distances (also
known as cophenetic distances) and the matrix of proximity data used to build
the corresponding dendrogram both have their rows and columns sorted in the
same order. When the cophenetic() function is used with objects of class
"hclust", it returns ultrametric matrices sorted in the appropriate order.
However, when the cophenetic() function is used with objects of class
"dendrogram", it returns ultrametric matrices sorted in the order of the
dendrogram leaves. The ultrametric() function in this package returns
ultrametric matrices in the appropriate order to calculate the cophenetic
correlation coefficient using the cor() function.
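The ordering issue can be seen with base R alone. The sketch below does not use ultrametric(); it reorders the leaf-ordered cophenetic matrix by label, which is one possible way to align it with the proximity data and is shown only to illustrate why the alignment matters. The variable names are illustrative.

```r
data(eurodist)
hc <- hclust(eurodist, method = "complete")

# From an "hclust" object: sorted like the original proximity data.
coph_hc <- cophenetic(hc)

# From a "dendrogram" object: sorted in the order of the dendrogram leaves.
coph_dd <- cophenetic(as.dendrogram(hc))

# Realign the leaf-ordered matrix with the proximity data by label.
labs <- attr(eurodist, "Labels")
coph_fixed <- as.dist(as.matrix(coph_dd)[labs, labs])

# Correlating against the leaf-ordered vector pairs up unrelated entries;
# only the realigned version is a meaningful cophenetic correlation.
cor(eurodist, coph_dd)
cor(eurodist, coph_fixed)
```

After realignment, coph_fixed holds the same distances as coph_hc, so both give the same correlation with eurodist.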
The space distortion ratio of a dendrogram is computed by the sdr() function
as the difference between the maximum and minimum ultrametric distances,
divided by the difference between the maximum and minimum original distances
used to build the dendrogram. Space dilation occurs when the space distortion
ratio is greater than 1.
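The ratio described above can be written directly from its definition. The function space_distortion below is a hypothetical helper written for illustration, not the package's sdr(), and it builds the ultrametric distances with cophenetic() and hclust() from the stats package rather than with linkage().

```r
# Range of the ultrametric distances divided by the range of the original
# distances. The ordering of the two "dist" objects does not matter here.
space_distortion <- function(prox, ultr) {
  (max(ultr) - min(ultr)) / (max(prox) - min(prox))
}

data(eurodist)
ultr_complete <- cophenetic(hclust(eurodist, method = "complete"))
ultr_single   <- cophenetic(hclust(eurodist, method = "single"))

# Complete linkage keeps both the smallest and the largest original distance,
# so the ratio is 1; single linkage cannot exceed 1.
space_distortion(eurodist, ultr_complete)
space_distortion(eurodist, ultr_single)
```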
linkage() in this package, hclust() in the stats package, and agnes() in the cluster package for building hierarchical trees.
# NOT RUN {
## distances between 21 cities in Europe
library(mdendro)
data(eurodist)
## comparison of dendrograms in terms of the following descriptive measures:
## - normalized tree balance
## - cophenetic correlation coefficient
## - normalized mean absolute error
## - space distortion ratio
## single linkage (call to the mdendro package)
dendro1 <- linkage(eurodist, method="single")
ntb(dendro1) # 0.2500664
ultr1 <- ultrametric(dendro1)
cor(eurodist, ultr1) # 0.7842797
mae(eurodist, ultr1) # 0.6352011
sdr(eurodist, ultr1) # 0.150663
## complete linkage (call to the stats package)
dendro2 <- as.dendrogram(hclust(eurodist, method="complete"))
ntb(dendro2) # 0.8112646
ultr2 <- ultrametric(dendro2)
cor(eurodist, ultr2) # 0.735041
mae(eurodist, ultr2) # 0.8469728
sdr(eurodist, ultr2) # 1
## unweighted arithmetic linkage (UPGMA)
dendro3 <- linkage(eurodist, method="arithmetic", weighted=FALSE)
ntb(dendro3) # 0.802202
ultr3 <- ultrametric(dendro3)
cor(eurodist, ultr3) # 0.7279432
mae(eurodist, ultr3) # 0.294578
sdr(eurodist, ultr3) # 0.5066903
## unweighted geometric linkage
dendro4 <- linkage(eurodist, method="geometric", weighted=FALSE)
ntb(dendro4) # 0.7531278
ultr4 <- ultrametric(dendro4)
cor(eurodist, ultr4) # 0.7419569
mae(eurodist, ultr4) # 0.2891692
sdr(eurodist, ultr4) # 0.4548112
# }