mdendro (version 1.0.1)

dendesc: Dendrogram Descriptive Measures

Description

Descriptive measures for analyzing objects of class "dendrogram".

Usage

ntb(dendro)

ultrametric(dendro)

mae(prox, ultr)

sdr(prox, ultr)

Arguments

dendro

Object of class "dendrogram" as produced by linkage() or by as.dendrogram() applied to the hierarchical trees returned by hclust() and agnes().

prox

Object of class "dist" containing the proximity data used to build the dendrogram.

ultr

Object of class "dist" containing the ultrametric distances in the dendrogram, sorted in the same order as the proximity data in prox.

Functions

  • ntb: Returns a number between 0 and 1 representing the normalized tree balance of the input dendrogram.

  • ultrametric: Returns an object of class "dist" containing the ultrametric distance matrix sorted in the same order as the proximity matrix used to build the corresponding dendrogram.

  • mae: Returns the normalized mean absolute error.

  • sdr: Returns the space distortion ratio.

Details

This package allows the calculation of several descriptive measures for dendrograms, such as normalized tree balance, cophenetic correlation coefficient, normalized mean absolute error, and space distortion ratio.

For each node in a dendrogram, its entropy is calculated using the concept of Shannon's entropy, which gives a maximum entropy of 1 to nodes merging subdendrograms with the same number of leaves. The average entropy of all nodes in a dendrogram is called its tree balance. The ntb() function computes the normalized tree balance by rescaling the tree balance of a dendrogram relative to the minimum tree balance of any dendrogram with the same number of elements. Perfectly balanced dendrograms have a normalized tree balance equal to 1, while binary dendrograms formed by chaining one new element at a time have a normalized tree balance equal to 0.
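The entropy and tree balance described above can be illustrated with a small base-R sketch. It is not the package's implementation: the helper names node_entropy() and tree_balance() are hypothetical, the logarithm base is assumed to equal the number of merged subdendrograms (so that equal-sized branches give an entropy of 1), and the normalization step performed by ntb() is deliberately left out.

```r
# Entropy of one node, given the leaf counts of the subdendrograms it merges.
# Assumption: log base = number of merged subdendrograms, so that
# equal-sized branches yield an entropy of exactly 1.
node_entropy <- function(sizes) {
  p <- sizes / sum(sizes)
  -sum(p * log(p, base = length(sizes)))
}

# Raw (non-normalized) tree balance: mean entropy over all internal nodes.
tree_balance <- function(d) {
  entropies <- numeric(0)
  visit <- function(node) {
    if (is.leaf(node)) return(invisible(NULL))
    kids <- unclass(node)  # plain list of child nodes
    sizes <- vapply(kids, function(k) attr(k, "members"), numeric(1))
    entropies <<- c(entropies, node_entropy(sizes))
    for (k in kids) visit(k)
  }
  visit(d)
  mean(entropies)
}

# Two toy dendrograms on four points: one balanced, one chained.
balanced <- as.dendrogram(hclust(dist(c(0, 1, 10, 11)), method = "complete"))
chained  <- as.dendrogram(hclust(dist(c(0, 1, 3, 7)), method = "single"))

tree_balance(balanced)  # every node merges equal-sized branches
tree_balance(chained)   # lower, because each merge adds a single leaf
```

The balanced toy tree reaches the maximum average entropy, while the chained one falls below it; ntb() then maps this raw value onto the 0 to 1 scale described above.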

To calculate the cophenetic correlation coefficient with the cor() function in the stats package, the matrix of ultrametric distances (also known as cophenetic distances) and the matrix of proximity data used to build the corresponding dendrogram must both have their rows and columns sorted in the same order. When cophenetic() is applied to objects of class "hclust", it returns ultrametric matrices sorted in the appropriate order. However, when it is applied to objects of class "dendrogram", it returns ultrametric matrices sorted in the order of the dendrogram leaves. The ultrametric() function in this package returns ultrametric matrices in the appropriate order to calculate the cophenetic correlation coefficient using the cor() function.
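The ordering issue can be reproduced with base R alone. The sketch below only illustrates the reordering idea and makes no claim about how ultrametric() is implemented; the variable names are chosen for this example.

```r
data(eurodist)

hc <- hclust(eurodist, method = "complete")

# Cophenetic distances from the "hclust" object: same order as eurodist.
coph_hc <- cophenetic(hc)

# Cophenetic distances from the "dendrogram" object: order of the leaves.
coph_dend <- cophenetic(as.dendrogram(hc))

# Reorder rows and columns by label so they match the proximity data.
lab <- attr(eurodist, "Labels")
coph_fixed <- as.dist(as.matrix(coph_dend)[lab, lab])

cor(eurodist, coph_hc)     # cophenetic correlation coefficient
cor(eurodist, coph_fixed)  # same value after reordering
cor(eurodist, coph_dend)   # pairs are misaligned, so not meaningful
```

Only the first two calls compare each original distance with the ultrametric distance of the same pair of cities.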

The space distortion ratio of a dendrogram is computed by the sdr() function as the difference between the maximum and minimum ultrametric distances, divided by the difference between the maximum and minimum original distances used to build the dendrogram. Space dilation occurs when the space distortion ratio is greater than 1.
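The definition above translates directly into a few lines of base R. The function name sdr_sketch() is hypothetical and the code is a sketch of the stated formula, not the source of sdr(); it assumes both arguments are "dist" objects.

```r
# Space distortion ratio as defined above: range of ultrametric distances
# divided by range of the original distances.
sdr_sketch <- function(prox, ultr) {
  (max(ultr) - min(ultr)) / (max(prox) - min(prox))
}

data(eurodist)

# Complete linkage: the first and last merge heights coincide with the
# smallest and largest original distances.
ultr_complete <- cophenetic(hclust(eurodist, method = "complete"))
sdr_complete  <- sdr_sketch(eurodist, ultr_complete)

# Single linkage: merge heights never exceed the original distances,
# so the ultrametric range is narrower.
ultr_single <- cophenetic(hclust(eurodist, method = "single"))
sdr_single  <- sdr_sketch(eurodist, ultr_single)

sdr_complete
sdr_single
```

For complete linkage the ratio is 1, which agrees with the value reported for sdr() in the Examples section, while single linkage gives a ratio below 1.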

See Also

linkage() in this package, hclust() in the stats package, and agnes() in the cluster package for building hierarchical trees.

Examples

# NOT RUN {
## distances between 21 cities in Europe
data(eurodist)

## comparison of dendrograms in terms of the following descriptive measures:
## - normalized tree balance
## - cophenetic correlation coefficient
## - normalized mean absolute error
## - space distortion ratio

## single linkage (call to the mdendro package)
dendro1 <- linkage(eurodist, method="single")
ntb(dendro1)          # 0.2500664
ultr1 <- ultrametric(dendro1)
cor(eurodist, ultr1)  # 0.7842797
mae(eurodist, ultr1)  # 0.6352011
sdr(eurodist, ultr1)  # 0.150663

## complete linkage (call to the stats package)
dendro2 <- as.dendrogram(hclust(eurodist, method="complete"))
ntb(dendro2)          # 0.8112646
ultr2 <- ultrametric(dendro2)
cor(eurodist, ultr2)  # 0.735041
mae(eurodist, ultr2)  # 0.8469728
sdr(eurodist, ultr2)  # 1

## unweighted arithmetic linkage (UPGMA)
dendro3 <- linkage(eurodist, method="arithmetic", weighted=FALSE)
ntb(dendro3)          # 0.802202
ultr3 <- ultrametric(dendro3)
cor(eurodist, ultr3)  # 0.7279432
mae(eurodist, ultr3)  # 0.294578
sdr(eurodist, ultr3)  # 0.5066903

## unweighted geometric linkage
dendro4 <- linkage(eurodist, method="geometric", weighted=FALSE)
ntb(dendro4)          # 0.7531278
ultr4 <- ultrametric(dendro4)
cor(eurodist, ultr4)  # 0.7419569
mae(eurodist, ultr4)  # 0.2891692
sdr(eurodist, ultr4)  # 0.4548112

# }
