canberra: Compute Canberra Dissimilarity from a Dense or Sparse Matrix

Description

Calculates the pairwise Canberra dissimilarity between the columns of a matrix.

Usage

canberra(x, weighted = TRUE, threads = 1)

Value

A column x column dist object.

Arguments

x: A matrix, sparseMatrix or Matrix.
weighted: A logical value; use abundances (weighted = TRUE) or presence/absence (weighted = FALSE) (default: TRUE).
threads: A whole number, the number of threads to use in setThreadOptions (default: 1).

Details

The Canberra dissimilarity between two samples \(A\) and \(B\), each of length \(n\), is defined as:

\(d(A,B) = \frac{1}{NZ} \sum_{i=1}^{n} \frac{|A_i - B_i|}{|A_i| + |B_i|}\)

where \(A_i\) and \(B_i\) are the abundances of the \(i\)-th feature in samples \(A\) and \(B\), respectively, and \(NZ\) is the number of non-zero entries. When weighted is set to FALSE, counts are replaced by presence/absence data before the dissimilarity is computed.
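
The formula can be illustrated for a single pair of samples with a minimal base R sketch. This is not the package's implementation: the helper name canberra_pair is hypothetical, and the sketch assumes that NZ counts the features that are non-zero in at least one of the two samples, so that 0/0 terms are skipped. Returning 0 when both samples are entirely zero is also an assumption.

```r
# Hypothetical helper for illustration only; not part of OmicFlow.
# Assumption: NZ is the number of features that are non-zero in at
# least one of the two samples, so 0/0 terms are left out of the sum.
canberra_pair <- function(a, b, weighted = TRUE) {
  if (!weighted) {
    # Replace counts by presence/absence (1 = present, 0 = absent).
    a <- as.numeric(a != 0)
    b <- as.numeric(b != 0)
  }
  denom <- abs(a) + abs(b)
  keep <- denom > 0              # features with a defined term
  nz <- sum(keep)
  if (nz == 0) return(0)         # assumption: two all-zero samples give 0
  sum(abs(a[keep] - b[keep]) / denom[keep]) / nz
}

a <- c(4, 0, 2, 0)
b <- c(1, 3, 2, 0)

# Terms: 3/5, 3/3 and 0/4; the fourth feature is zero in both samples.
canberra_pair(a, b)                    # (0.6 + 1 + 0) / 3
canberra_pair(a, b, weighted = FALSE)  # (0 + 1 + 0) / 3
```

Under these assumptions the result always lies between 0 and 1, because each term is at most 1 and the sum is divided by the number of terms.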

References

Lance, G.N. & Williams, W.T. (1967) Mixed-data classificatory programs. I. Agglomerative systems. Australian Computer Journal, 1(1), 15-20.

Examples

library("OmicFlow")

metadata_file <- system.file("extdata", "metadata.tsv", package = "OmicFlow")
counts_file <- system.file("extdata", "counts.tsv", package = "OmicFlow")
features_file <- system.file("extdata", "features.tsv", package = "OmicFlow")
tree_file <- system.file("extdata", "tree.newick", package = "OmicFlow")

taxa <- metagenomics$new(
    metaData = metadata_file,
    countData = counts_file,
    featureData = features_file,
    treeData = tree_file
)

taxa$feature_subset(Kingdom == "Bacteria")
taxa$normalize()

canberra(taxa$countData)
