Calculates dissimilarities between samples: Bray-Curtis dissimilarities, Generalized UniFrac dissimilarities, or both.
sampDis(sampleData, compDisMat = NULL, type = "BrayCurtis", alpha = 1)
List with sample dissimilarity matrices. A list is always returned, even if only one matrix is calculated.
sampleData
Data frame with the relative concentration of each compound (column) in every sample (row).
compDisMat
Compound dissimilarity matrix, as calculated by
compDis. If this is supplied, Generalized UniFrac
dissimilarities can be calculated.
type
Type of sample dissimilarities to be calculated. This is
Bray-Curtis dissimilarities, type = "BrayCurtis", and/or
Generalized UniFrac dissimilarities, type = "GenUniFrac".
alpha
Parameter used in calculations of Generalized UniFrac
dissimilarities. alpha can be set between 0 and 1.
With alpha = 0, equal weight is put on every
branch in the dendrogram. With alpha = 1, branches are
weighted by their abundance, and hence more emphasis is put on high
abundance branches. alpha = 0.5 strikes a balance between the two.
alpha = 0.5 or alpha = 1 is recommended, with alpha = 1 as the default.
See Chen et al. 2012 for details.
The function calculates a dissimilarity matrix for all the samples
in sampleData, for the given dissimilarity index/indices.
Bray-Curtis dissimilarities are calculated using only
sampleData. This is the most commonly used dissimilarity
index for phytochemical data (other types of dissimilarities are
easily calculated using the vegdist function in
the vegan package).
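As a minimal sketch of what the Bray-Curtis index measures, the base R code below computes it by hand with the standard formula sum(|x - y|) / sum(x + y). The toyData values and the brayCurtis helper are hypothetical and made up for illustration; this is not the function's internal implementation.

```r
# Hypothetical toy data: 3 samples (rows), 3 compounds (columns),
# each row holding relative concentrations that sum to 1.
toyData <- data.frame(compA = c(0.5, 0.2, 0.0),
                      compB = c(0.5, 0.3, 0.0),
                      compC = c(0.0, 0.5, 1.0),
                      row.names = c("s1", "s2", "s3"))

# Illustrative helper (not part of any package):
# Bray-Curtis dissimilarity between two abundance vectors.
brayCurtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Fill a symmetric sample-by-sample dissimilarity matrix.
m <- as.matrix(toyData)
n <- nrow(m)
bcMat <- matrix(0, n, n, dimnames = list(rownames(m), rownames(m)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    bcMat[i, j] <- brayCurtis(m[i, ], m[j, ])
  }
}

# s1 and s3 share no compounds, so their dissimilarity is 1;
# s2 overlaps partly with both, giving 0.5 in each case.
print(round(bcMat, 3))
```

If the vegan package is installed, vegan::vegdist(toyData, method = "bray") should give the same pairwise values.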
If a compound dissimilarity matrix, compDisMat, is supplied,
Generalized UniFrac dissimilarities can be calculated, which also
use the compound dissimilarity matrix for the sample dissimilarity
calculations. For the calculation of Generalized UniFrac
dissimilarities (Chen et al. 2012), the compound dissimilarity matrix is
transformed into a dendrogram using hierarchical clustering (with the
UPGMA method). UniFrac dissimilarities quantify the
fraction of the total branch length of the dendrogram that leads to
compounds present in either sample, but not both. The (weighted) Generalized
UniFrac dissimilarities implemented here additionally take compound
abundances into account. In this way, both the relative proportions of
compounds and the biosynthetic/structural dissimilarities of the compounds
are accounted for in the calculations of sample dissimilarities, such that
two samples containing more biosynthetically/structurally different
compounds have a higher pairwise dissimilarity than two samples
containing more biosynthetically/structurally similar compounds.
As with Bray-Curtis dissimilarities, Generalized UniFrac dissimilarities
range in value from 0 to 1.
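The idea can be sketched in base R under a few assumptions. The code below builds a UPGMA dendrogram with hclust(method = "average") and applies the Generalized UniFrac formula of Chen et al. 2012 to it: each branch of length b with summed compound proportions pA and pB below it contributes b * (pA + pB)^alpha * |pA - pB| / (pA + pB) to the numerator and b * (pA + pB)^alpha to the denominator. The toy matrix, the sample vectors and the genUniFracSketch helper are hypothetical; this is an illustration of the principle, not the function's actual implementation.

```r
# Hypothetical compound dissimilarities: c1 and c2 are structurally
# similar (0.2), while c3 is very different from both (0.8).
compMat <- matrix(c(0.0, 0.2, 0.8,
                    0.2, 0.0, 0.8,
                    0.8, 0.8, 0.0),
                  nrow = 3,
                  dimnames = list(c("c1", "c2", "c3"), c("c1", "c2", "c3")))

# UPGMA corresponds to average linkage in hclust().
hc <- hclust(as.dist(compMat), method = "average")

# Illustrative helper (not part of any package): Generalized UniFrac
# dissimilarity between two samples, given named proportion vectors.
genUniFracSketch <- function(pA, pB, hc, alpha = 1) {
  pA <- unname(pA[hc$labels])  # align compounds with the dendrogram leaves
  pB <- unname(pB[hc$labels])
  nMerge <- nrow(hc$merge)
  cumA <- numeric(nMerge)      # summed proportions below each internal node
  cumB <- numeric(nMerge)
  num <- 0
  den <- 0
  for (k in seq_len(nMerge)) {
    for (child in hc$merge[k, ]) {
      if (child < 0) {         # negative entry: a leaf (single compound)
        a <- pA[-child]; b <- pB[-child]; h <- 0
      } else {                 # positive entry: a cluster formed earlier
        a <- cumA[child]; b <- cumB[child]; h <- hc$height[child]
      }
      len <- hc$height[k] - h  # branch length from child up to its parent
      s <- a + b
      if (s > 0) {             # branches absent from both samples are skipped
        w <- len * s^alpha
        num <- num + w * abs(a - b) / s
        den <- den + w
      }
      cumA[k] <- cumA[k] + a
      cumB[k] <- cumB[k] + b
    }
  }
  num / den
}

# Three hypothetical samples, each containing a single compound.
sX <- c(c1 = 1, c2 = 0, c3 = 0)
sY <- c(c1 = 0, c2 = 1, c3 = 0)
sZ <- c(c1 = 0, c2 = 0, c3 = 1)

dXY <- genUniFracSketch(sX, sY, hc, alpha = 1)  # similar compounds: 0.25
dXZ <- genUniFracSketch(sX, sZ, hc, alpha = 1)  # dissimilar compounds: 1
print(c(dXY = dXY, dXZ = dXZ))
```

No pair of these samples shares a compound, so all Bray-Curtis dissimilarities would be 1. The sketch instead gives a lower value for the pair whose compounds sit close together in the dendrogram.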
Bray JR, Curtis JT. 1957. An ordination of the upland forest communities of southern Wisconsin. Ecological Monographs 27: 325-349.
Chen J, Bittinger K, Charlson ES, et al. 2012. Associating microbiome composition with environmental covariates using generalized UniFrac distances. Bioinformatics 28: 2106-2113.
Lozupone C, Knight R. 2005. UniFrac: a new phylogenetic method for comparing microbial communities. Applied and Environmental Microbiology 71: 8228-8235.
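# Assumption: sampDis and the example data sets come from the chemodiv package
library(chemodiv)

# Bray-Curtis dissimilarities only (the default type)
data(minimalSampData)
data(minimalCompDis)
sampDis(minimalSampData)

# Both indices, with alpha = 0.5 for Generalized UniFrac
sampDis(sampleData = minimalSampData, compDisMat = minimalCompDis,
        type = c("BrayCurtis", "GenUniFrac"), alpha = 0.5)

# Generalized UniFrac only, with the default alpha = 1
data(alpinaSampData)
data(alpinaCompDis)
sampDis(sampleData = alpinaSampData, compDisMat = alpinaCompDis,
        type = "GenUniFrac")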