optpart (version 3.0-3)

gensilwidth: Generalized Silhouette Width

Description

Calculates mean cluster silhouette widths using a generalized mean.

Usage

gensilwidth(clust, dist, p=1)

Arguments

clust

an integer vector of cluster memberships or a classification object of class ‘clustering’

dist

an object of class ‘dist’

p

the scaling parameter, i.e. the exponent of the generalized mean (default 1, the arithmetic mean)

Value

an object of class ‘silhouette’, a list with components

cluster

the assigned cluster for each sample unit

neighbor

the identity of the nearest neighbor cluster for each sample unit

sil_width

the silhouette width for each sample unit

Details

gensilwidth calculates mean cluster silhouette widths using a generalized mean. The scaling parameter \(p\) can take any value in \([-\infty,\infty]\): values less than one emphasize connectivity, and values greater than one emphasize compactness. Individual sample unit silhouette widths are still calculated as \(s_i = (b_i - a_i) / \max(b_i,a_i)\), where \(a_i\) is the mean dissimilarity of a sample unit to the cluster to which it is assigned, and \(b_i\) is the mean dissimilarity to the nearest neighbor cluster. Given the silhouette widths \(s_k\), \(k = 1, \dots, n\), of all \(n\) members of a cluster, the generalized mean is calculated as

$$\bar s = \left( {1\over n} \sum_{k=1}^n s_k^p \right)^{1/p}$$

Specific values of \(p\) are handled separately:

for p=0 $$\bar s = \left( \prod_{k=1}^n s_k \right)^{1/n}$$

for p=\(-\infty\) $$\bar s = \min_{k=1}^n s_k$$

for p=\(\infty\) $$\bar s = \max_{k=1}^n s_k$$

Familiar special cases: \(p=-1\) gives the harmonic mean, \(p=0\) the geometric mean, and \(p=1\) the arithmetic mean.
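
The effect of the scaling parameter can be seen with a small base-R sketch of the generalized mean itself. The helper `gen_mean()` and the vector `s` below are hypothetical, written only for illustration; they are not part of optpart and are not claimed to reproduce how gensilwidth is implemented internally. The sketch assumes strictly positive silhouette widths, since negative values make \(s_k^p\) undefined for fractional \(p\).

```r
# Illustrative helper (NOT an optpart function): generalized mean of x
# with exponent p, following the formulas in this section.
gen_mean <- function(x, p = 1) {
  n <- length(x)
  if (p == 0) {
    prod(x)^(1 / n)          # geometric mean
  } else if (p == -Inf) {
    min(x)                   # limit as p -> -Inf
  } else if (p == Inf) {
    max(x)                   # limit as p -> Inf
  } else {
    (sum(x^p) / n)^(1 / p)   # general case
  }
}

# Made-up, strictly positive silhouette widths for one cluster
s <- c(0.2, 0.5, 0.8)

# Generalized mean for a range of scaling parameters
p_values <- c(-Inf, -1, 0, 1, 2, Inf)
means <- sapply(p_values, function(p) gen_mean(s, p))
print(data.frame(p = p_values, gen_mean = means))
```

For positive values the generalized mean never decreases as \(p\) increases, so small \(p\) lets the worst-fitting sample units dominate the cluster summary while large \(p\) favours the best-fitting ones.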

References

Lengyel, A. and Z. Botta-Dukát. 2019. Silhouette width using generalized mean: A flexible method for assessing clustering efficiency. Ecology and Evolution. https://doi.org/10.1002/ece3.5774

See Also

silhouette

Examples

# NOT RUN {
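library(optpart)
data(shoshveg)                        # example vegetation data set
dis.bc <- dsvdis(shoshveg, 'bray')    # Bray-Curtis dissimilarities
opt.5 <- optpart(5, dis.bc)           # partition into 5 clusters
gensilwidth(opt.5, dis.bc)            # default p = 1 (arithmetic mean)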
# }
