
OmicFlow (version 1.5.0)

cosine: Compute Cosine Dissimilarity from a Dense or Sparse Matrix.

Description

Calculates the pairwise cosine dissimilarity between the columns of a matrix.

Usage

cosine(x, weighted = TRUE, threads = 1)

Value

A column x column dist object.

Arguments

x

A matrix, sparseMatrix or Matrix.

weighted

A boolean value: use abundances (weighted = TRUE) or presence/absence (weighted = FALSE). Default: TRUE.

threads

A whole number, the number of threads to use in setThreadOptions (default: 1).

Details

The cosine dissimilarity between two samples \(A\) and \(B\), each of length \(n\), is defined as:

\(d(A,B) = 1 - \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \sqrt{\sum_{i=1}^{n} B_i^2}} \)

where \(A_i\) and \(B_i\) are the abundances of the \(i\)-th feature in sample \(A\) and \(B\), respectively. When weighted is set to FALSE, counts are replaced by presence/absence data.
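
The definition above can be checked by hand with a few lines of base R. The following is a minimal sketch, not OmicFlow's implementation: cosine_ref is a hypothetical helper name, it assumes a dense base matrix with features in rows and samples in columns, and it ignores the threads argument. A column containing only zeros has norm 0 and would produce NaN here; how cosine() itself handles that case is not stated on this page.

```r
# Hypothetical reference helper (not part of OmicFlow): pairwise cosine
# dissimilarity between the columns of a dense matrix, following d(A, B).
cosine_ref <- function(x, weighted = TRUE) {
  x <- as.matrix(x)
  if (!weighted) {
    # Replace counts by presence/absence (1 if the feature is present, else 0)
    x <- (x > 0) * 1
  }
  dots  <- crossprod(x)        # t(x) %*% x: dot product of every column pair
  norms <- sqrt(diag(dots))    # Euclidean norm of each column
  sim   <- dots / outer(norms, norms)
  as.dist(1 - sim)             # column x column dist object
}

# Toy data: 3 features (rows) x 3 samples (columns).
# B is a scaled copy of A; C shares no features with A or B.
m <- matrix(c(1, 0, 2,
              2, 0, 4,
              0, 3, 0),
            nrow = 3,
            dimnames = list(paste0("f", 1:3), c("A", "B", "C")))

d <- cosine_ref(m)
d_pa <- cosine_ref(m, weighted = FALSE)
print(d)
```

In this toy matrix, A and B point in the same direction, so their dissimilarity is 0 up to floating-point error, while C is orthogonal to both and sits at dissimilarity 1. Comparing such a hand computation against cosine(m) on a small matrix is one way to confirm the orientation (columns as samples) before running on real data.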

References

Deza, M. M., & Deza, E. (2009). Encyclopedia of Distances. Springer Science & Business Media, 308.

Examples

library("OmicFlow")

metadata_file <- system.file("extdata", "metadata.tsv", package = "OmicFlow")
counts_file <- system.file("extdata", "counts.tsv", package = "OmicFlow")
features_file <- system.file("extdata", "features.tsv", package = "OmicFlow")
tree_file <- system.file("extdata", "tree.newick", package = "OmicFlow")

taxa <- metagenomics$new(
    metaData = metadata_file,
    countData = counts_file,
    featureData = features_file,
    treeData = tree_file
)

taxa$feature_subset(Kingdom == "Bacteria")
taxa$normalize()

cosine(taxa$countData)
