unifrac: Compute UniFrac Dissimilarity from a Dense or Sparse Matrix.

Description

Calculates the UniFrac dissimilarity between samples based on phylogenetic branch lengths and abundance or presence/absence data.

Usage

unifrac(x, tree, weighted = TRUE, normalized = TRUE, threads = 1)

Value

A dist object holding the pairwise UniFrac dissimilarities between the columns of x (column x column).

Arguments

x: A matrix, sparseMatrix or Matrix of non-negative counts or presence/absence data.
tree: A phylo class tree.
weighted: A logical value, whether to use abundances (weighted = TRUE) or presence/absence (weighted = FALSE) (default: TRUE).
normalized: A logical value, whether to normalize weighted UniFrac distances to lie between 0 and 1 (default: TRUE). Unweighted UniFrac is always normalized.
threads: A whole number, the number of threads passed to setThreadOptions (default: 1).

Details

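The UniFrac distance between two samples \(A\) and \(B\), with phylogenetic tree edges \(i = 1 \ldots n\) of lengths \(L_i\), is computed differently depending on the weighted and normalized flags. Here \(A_i\) and \(B_i\) denote the values of samples \(A\) and \(B\) attributed to edge \(i\), that is, to the taxa descending from that edge. When weighted = FALSE, input counts are first converted to presence/absence data.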

Weighted UniFrac (weighted = TRUE and normalized = FALSE): \(d(A,B) = \sum_{i=1}^{n} L_i |A_i - B_i|\)
Normalized Weighted UniFrac (weighted = TRUE and normalized = TRUE): \(d(A,B) = \frac{\sum_{i=1}^{n} L_i |A_i - B_i|}{\sum_{i=1}^{n} L_i (A_i + B_i)}\)
Unweighted UniFrac (weighted = FALSE; always normalized): \(d(A,B) = \frac{\sum_{i=1}^{n} L_i |A_i - B_i|}{\sum_{i=1}^{n} L_i \max(A_i, B_i)}\)
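
The three expressions above can be checked by hand on a toy example. The following is a minimal base R sketch, not the package's implementation: the vectors L, A and B are hypothetical per-edge values invented for illustration, and the step that maps taxa onto tree edges is assumed to have been done already.

```r
# Hypothetical toy inputs (not from OmicFlow): four tree edges with
# lengths L, and the per-edge values of two samples A and B.
L <- c(0.5, 1.0, 0.25, 2.0)
A <- c(0.6, 0.4, 0.0, 1.0)
B <- c(0.0, 0.5, 0.5, 1.0)

# Plain sum form: branch-length-weighted sum of absolute differences.
weighted_sum <- sum(L * abs(A - B))

# Ratio form: the same sum divided by the branch-length-weighted total,
# which bounds the result between 0 and 1.
weighted_ratio <- weighted_sum / sum(L * (A + B))

# Unweighted form: convert to presence/absence first, then divide the
# length of unshared edges by the length of all occupied edges.
pa_A <- as.numeric(A > 0)
pa_B <- as.numeric(B > 0)
unweighted <- sum(L * abs(pa_A - pa_B)) / sum(L * pmax(pa_A, pa_B))

print(c(sum = weighted_sum, ratio = weighted_ratio, unweighted = unweighted))
```

In this toy case the first and third edges are each occupied by only one sample, so the unweighted value is their combined length (0.75) divided by the total occupied length (3.75).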

References

Lozupone, C., & Knight, R. (2005). UniFrac: a new phylogenetic method for comparing microbial communities. Applied and Environmental Microbiology, 71(12), 8228–8235.

Examples

library("OmicFlow")

metadata_file <- system.file("extdata", "metadata.tsv", package = "OmicFlow")
counts_file <- system.file("extdata", "counts.tsv", package = "OmicFlow")
features_file <- system.file("extdata", "features.tsv", package = "OmicFlow")
tree_file <- system.file("extdata", "tree.newick", package = "OmicFlow")

taxa <- metagenomics$new(
    metaData = metadata_file,
    countData = counts_file,
    featureData = features_file,
    treeData = tree_file
)

taxa$feature_subset(Kingdom == "Bacteria")
taxa$normalize()

# Weighted UniFrac
unifrac(x = taxa$countData, tree = taxa$treeData, weighted = TRUE, normalized = FALSE)

# Weighted Normalized UniFrac
unifrac(x = taxa$countData, tree = taxa$treeData, weighted = TRUE, normalized = TRUE)

# Unweighted UniFrac
unifrac(x = taxa$countData, tree = taxa$treeData, weighted = FALSE)
