Distance / dissimilarity between samples.
bdiv_table(
biom,
bdiv = "bray",
weighted = NULL,
normalized = NULL,
tree = NULL,
md = ".all",
within = NULL,
between = NULL,
delta = ".all",
norm = "none",
pseudocount = NULL,
power = 1.5,
alpha = 0.5,
transform = "none",
ties = "random",
seed = 0,
cpus = n_cpus(),
...
)
bdiv_matrix(
biom,
bdiv = "bray",
weighted = NULL,
normalized = NULL,
tree = NULL,
within = NULL,
between = NULL,
norm = "none",
pseudocount = NULL,
power = 1.5,
alpha = 0.5,
transform = "none",
ties = "random",
seed = 0,
cpus = n_cpus()
)
bdiv_distmat(
biom,
bdiv = "bray",
weighted = NULL,
normalized = NULL,
tree = NULL,
within = NULL,
between = NULL,
norm = "none",
pseudocount = NULL,
power = 1.5,
alpha = 0.5,
transform = "none",
ties = "random",
seed = 0,
cpus = n_cpus()
)
bdiv_matrix() - An R matrix of samples x samples.
bdiv_distmat() - A dist-class distance matrix.
bdiv_table() - A tibble data.frame with columns named .sample1, .sample2,
.bdiv, .distance, and any fields requested by md. Numeric metadata
fields will be returned as abs(x - y); categorical metadata fields as
"x", "y", or "x vs y".
An rbiom object, or any value accepted by as_rbiom().
Beta diversity distance algorithm(s) to use. Options are:
c("aitchison", "bhattacharyya", "bray", "canberra", "chebyshev", "chord", "clark", "sorensen", "divergence", "euclidean", "generalized_unifrac", "gower", "hamming", "hellinger", "horn", "jaccard", "jensen", "jsd", "lorentzian", "manhattan", "matusita", "minkowski", "morisita", "motyka", "normalized_unifrac", "ochiai", "psym_chisq", "soergel", "squared_chisq", "squared_chord", "squared_euclidean", "topsoe", "unweighted_unifrac", "variance_adjusted_unifrac", "wave_hedges", "weighted_unifrac").
For the UniFrac family, a phylogenetic tree must be present in biom
or explicitly provided via tree=. Supports partial matching.
Multiple values are allowed for functions which return a table or
plot. Default: "bray"
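As a rough illustration of the default "bray" metric and of the three return shapes described above, the following base-R sketch computes Bray-Curtis dissimilarity by hand on a small, made-up count matrix. It does not call rbiom; the `counts` data and the `bray()` helper are hypothetical, and rbiom's own implementation may differ in detail.

```r
# Hypothetical toy counts: rows are taxa, columns are samples.
counts <- matrix(
  c(10, 0, 5,
     3, 7, 5,
     0, 8, 5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("t1", "t2", "t3"), c("A", "B", "C")))

# Bray-Curtis dissimilarity: sum(|x - y|) / sum(x + y).
bray <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Long format, one row per sample pair (column names mirror bdiv_table()).
pairs <- t(combn(colnames(counts), 2))
long <- data.frame(
  .sample1  = pairs[, 1],
  .sample2  = pairs[, 2],
  .bdiv     = "bray",
  .distance = apply(pairs, 1, function(p) bray(counts[, p[1]], counts[, p[2]])))

# Square samples x samples matrix (the shape bdiv_matrix() returns).
m <- matrix(0, 3, 3, dimnames = list(colnames(counts), colnames(counts)))
m[cbind(long$.sample1, long$.sample2)] <- long$.distance
m <- m + t(m)

# dist-class object (the shape bdiv_distmat() returns).
d <- as.dist(m)
```

The long table is convenient for joining with metadata, while the `dist` object can be passed directly to functions such as `hclust()`.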
(Deprecated - weighting is now inherent in bdiv metric name.)
Take relative abundances into account. When weighted=FALSE, only
presence/absence is considered. Multiple values allowed. Default: NULL
(Deprecated - normalization is now inherent in bdiv metric
name.) Only changes the "Weighted UniFrac" calculation. Divides result by
the total branch weights. Default: NULL
A phylo object representing the phylogenetic
relationships of the taxa in biom. Only required when
computing UniFrac distances. Default: biom$tree
Dataset field(s) to include in the output data frame, or '.all'
to include all metadata fields. Default: '.all'
Dataset field(s) for intra- or inter- sample
comparisons. Alternatively, dataset field names given elsewhere can
be prefixed with '==' or '!=' to assign them to within or
between, respectively. Default: NULL
For numeric metadata, report the absolute difference in values
for the two samples, for instance 2 instead of "10 vs 12".
Default: TRUE
Normalize the incoming counts. Options are:
'none': No transformation.
'percent': Relative abundance (sample abundances sum to 1).
'binary': Unweighted presence/absence (each count is either 0 or 1).
'clr': Centered log ratio.
Default: 'none'.
Value added to counts to handle zeros when
norm = 'clr'. Ignored for other normalization methods.
Default: NULL (emits a warning).
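The normalization options can be sketched in base R as follows. This is a minimal illustration on made-up counts, assuming the natural log and a pseudocount of 1 for the centered log ratio; the `clr()` helper is hypothetical and rbiom's internal code may make different choices.

```r
# Hypothetical toy counts: rows are taxa, columns are samples.
counts <- matrix(
  c(10, 0, 5,
     3, 7, 5,
     0, 8, 5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("t1", "t2", "t3"), c("A", "B", "C")))

# 'percent': each sample's abundances sum to 1.
percent <- sweep(counts, 2, colSums(counts), "/")

# 'binary': presence/absence.
binary <- (counts > 0) * 1

# 'clr': log of (count + pseudocount), centered on the per-sample mean.
# The pseudocount keeps log() finite when a count is zero.
clr <- function(x, pseudocount) {
  lx <- log(x + pseudocount)
  lx - mean(lx)
}
clr_counts <- apply(counts, 2, clr, pseudocount = 1)

colSums(percent)     # every sample sums to 1
colSums(clr_counts)  # every sample sums to (numerically) zero
```

Without a pseudocount, samples A and B above would produce `-Inf` values, which is why a value is needed whenever zeros are present.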
Scaling factor for the magnitude of differences between
communities (\(p\)) when bdiv = 'minkowski'. Ignored for other
beta diversity metrics. Default: 1.5
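To show what the power term does, here is a small base-R sketch of the Minkowski distance for two made-up samples, checked against `stats::dist()`. It is an illustration of the formula only; whether rbiom applies it to raw or normalized counts is not assumed here.

```r
# Two hypothetical samples as rows (stats::dist compares rows).
x <- rbind(A = c(10, 3, 0),
           B = c(0, 7, 8))
p <- 1.5

# Minkowski distance: (sum |a - b|^p)^(1/p).
manual  <- sum(abs(x["A", ] - x["B", ])^p)^(1 / p)
builtin <- as.numeric(dist(x, method = "minkowski", p = p))

# p = 1 gives the Manhattan distance, p = 2 the Euclidean distance.
manhattan <- sum(abs(x["A", ] - x["B", ]))
euclidean <- sqrt(sum((x["A", ] - x["B", ])^2))
```

Larger values of p give more weight to the taxa with the largest abundance differences, so the p = 1.5 result falls between the Manhattan and Euclidean distances.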
The alpha term to use in Generalized UniFrac. How much weight
to give to relative abundances; a value between 0 and 1, inclusive.
Setting alpha=1 is equivalent to Normalized UniFrac. Default: 0.5
Transformation to apply to calculated values. Options are:
c("none", "rank", "log", "log1p", "sqrt", "percent"). "rank" is
useful for correcting non-normal distributions before applying
regression statistics. Default: "none"
When transform="rank", how to rank identical values.
Options are: c("average", "first", "last", "random", "max", "min").
See rank() for details. Default: "random"
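The effect of the tie-handling options can be seen with base R's `rank()` on a made-up vector of distances that contains one tie. This sketch only demonstrates `rank()` itself, not rbiom's internal use of it.

```r
# Hypothetical distances; the two 0.5 values are tied.
d <- c(0.2, 0.5, 0.5, 0.9)

r_average <- rank(d, ties.method = "average")  # 1.0 2.5 2.5 4.0
r_min     <- rank(d, ties.method = "min")      # 1 2 2 4
r_max     <- rank(d, ties.method = "max")      # 1 3 3 4
r_first   <- rank(d, ties.method = "first")    # 1 2 3 4

# "random" breaks ties arbitrarily, so set a seed for reproducible output.
set.seed(0)
r_random <- rank(d, ties.method = "random")
```

With "random", the tied values receive ranks 2 and 3 in an order that depends on the random number generator state, which is why a seed matters when transform = "rank".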
Random seed for permutations. Must be a non-negative integer.
Default: 0
The number of CPUs to use. Set to NULL to use all available,
or to 1 to disable parallel processing. Default: NULL
Not used.
Prefix metadata fields with == or != to limit comparisons to within or
between groups, respectively. For example, stat.by = '==Sex' will
run calculations only for intra-group comparisons, returning "Male" and
"Female", but NOT "Female vs Male". Similarly, setting
stat.by = '!=Body Site' will only show the inter-group comparisons, such
as "Saliva vs Stool", "Anterior nares vs Buccal mucosa", and so on.
The same effect can be achieved by using the within and between
parameters. stat.by = '==Sex' is equivalent to
stat.by = 'Sex', within = 'Sex'.
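The within/between behavior described above can be mimicked in base R on a made-up table of sample pairs. The sample names and `sex` lookup below are hypothetical, and this is only a sketch of the filtering logic, not rbiom's implementation.

```r
# Hypothetical sample pairs and a per-sample metadata lookup.
pairs <- data.frame(
  .sample1 = c("S1", "S1", "S1", "S2", "S2", "S3"),
  .sample2 = c("S2", "S3", "S4", "S3", "S4", "S4"),
  stringsAsFactors = FALSE)
sex <- c(S1 = "Male", S2 = "Female", S3 = "Male", S4 = "Female")

a <- unname(sex[pairs$.sample1])
b <- unname(sex[pairs$.sample2])
same <- a == b

# Like '==Sex' (within): keep pairs whose samples share a value.
within_pairs <- pairs[same, ]
within_pairs$Sex <- a[same]

# Like '!=Sex' (between): keep pairs whose samples differ, labeled "x vs y".
between_pairs <- pairs[!same, ]
between_pairs$Sex <- paste(pmin(a, b)[!same], "vs", pmax(a, b)[!same])
```

Here `within_pairs` holds the "Male" and "Female" comparisons, and `between_pairs` holds only "Female vs Male" comparisons.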
Other beta_diversity:
bdiv_boxplot(),
bdiv_clusters(),
bdiv_corrplot(),
bdiv_heatmap(),
bdiv_ord_plot(),
bdiv_ord_table(),
bdiv_stats(),
distmat_stats()
library(rbiom)
# Subset to four samples
biom <- hmp50$clone()
biom$counts <- biom$counts[,c("HMP18", "HMP19", "HMP20", "HMP21")]
# Return in long format with metadata
bdiv_table(biom, 'w_unifrac', md = ".all")
# Only compare samples from the same body site
bdiv_table(biom, 'w_unifrac', md = c("==Body Site", "Sex"))
# Or between males and females
bdiv_table(biom, 'w_unifrac', md = c("Body Site", "!=Sex"))
# All-vs-all matrix
bdiv_matrix(biom, 'w_unifrac')
# All-vs-all distance matrix
dm <- bdiv_distmat(biom, 'w_unifrac')
dm
plot(hclust(dm))