Community: Manipulation of Community objects.

Description

Methods and functions documented on this page are for manipulating microbial community data represented as Community instances.

Usage

## Construction
Community(community = matrix(0L, 0L, 0L), samples = data.frame(),
    ..., verbose = TRUE)
## Access
samples(x, ...)
samples(x, ...) <- value
communities(x, ...)
communities(x, ...) <- value
## Normalization
## S3 method for class 'Community':
normalize(x, byTaxon = "none", bySample = c("none", "median"),
    transform = c("none", "asinh"))
## S3 method for class 'Community':
trimq(x, taxonQ = 0, sampleQ = 0, ...)
## Descriptive statistics
diversityStatistics(x,
    statistics = c("Shannon", "Simpson", "Chao1", "ACE"))
occur(x, group, ..., threshold)
pa(x, group, ...)
## Model fit
glmFit(formula, x, ..., FUN = glm.nb)
glmSummary(fit, ...)

Arguments

community

An integer or numeric matrix of counts, with rows corresponding to taxa and columns to samples.

samples

A data.frame of sample metadata, with as many rows as there are columns in community.

x

An instance of the Community class.

value

A data.frame or DataFrame instance (for samples<-) or matrix (for communities<-) of values to be used to update x.

byTaxon, bySample, transform

A character(1) for each argument, chosen from the values listed in the function signature. Normalization scales rows or columns to a common value: by taxon (only "none"; other options are not currently implemented) or by sample ("none", or scaled by the column median). Post-normalization values can optionally be asinh-transformed.

taxonQ, sampleQ

A numeric(1) quantile below which taxa or samples are excluded.

statistics

A character() of desired diversity statistics; one or more of the values specified in the function signature.

group

For occur and pa: a column name from samples() indicating how counts are to be grouped.

threshold

A numeric(1) threshold above which occurrence is to be tabulated.

...

Additional arguments, not used by functions or methods described on this page.

formula

A formula describing the model to be fitted.

FUN

The function used to fit the model; defaults to glm.nb. See Details for additional information.

fit

The result of glmFit.

verbose

logical(1) describing whether actions taken to make community and samples conform are reported.

Value

Community returns an instance of the Community class. communities returns a matrix of taxa x samples. samples returns a DataFrame of sample metadata.
normalize and trimq return a Community instance, adjusted to reflect the requested transformations.
diversityStatistics, occur, and pa return a data.frame of summary statistics, with rows corresponding to taxonomic groups.
glmFit returns a SimpleList of fitted models. glmSummary returns a data.frame of estimated coefficients and P-values.

Details

Community is used to create a data structure that allows convenient, coordinated manipulation of taxa and samples, including sample metadata. communities and samples provide access to the community matrix and sample metadata of a Community-class instance.
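
The idea of keeping a count matrix and its sample metadata coordinated can be sketched in base R. This is only an illustration of the concept, not the package's implementation: it assumes that samples are matched by name (matrix column names against metadata row names) and that unmatched samples are dropped. All object names and data are hypothetical.

```r
# Hypothetical taxa x samples matrix with four samples
community <- matrix(1:12, nrow = 3,
    dimnames = list(paste0("taxon", 1:3), c("s1", "s2", "s3", "s4")))

# Hypothetical metadata, in a different order and with a non-matching sample
samples <- data.frame(group = c("a", "b", "a"),
    row.names = c("s3", "s1", "s5"))

# Assumption: conform the two objects on their shared sample names
shared <- intersect(colnames(community), rownames(samples))
community <- community[, shared, drop = FALSE]
samples <- samples[shared, , drop = FALSE]

# Columns of the matrix and rows of the metadata are now in the same order
identical(colnames(community), rownames(samples))
```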

normalize provides ways to standardize the column sums or scale on which count data are represented.
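
As a rough base-R sketch of sample-wise normalization followed by an asinh transform: the code below assumes one plausible reading of median scaling, namely that each column is rescaled so its total equals the median of the column totals. The package may define the scaling differently; the data and variable names are hypothetical.

```r
# Hypothetical taxa x samples counts; column totals are 15, 30, and 50
counts <- matrix(c(10, 0, 5,
                   20, 4, 6,
                   40, 8, 2),
    nrow = 3,
    dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:3)))

# Assumption: scale every sample to the median column total
totals <- colSums(counts)
scaled <- sweep(counts, 2, totals / median(totals), "/")
colSums(scaled)   # every column now sums to the median total, 30

# asinh behaves like log for large values but is defined at zero
transformed <- asinh(scaled)
```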

trimq reduces the number of taxa or samples present in a community by including only those taxa or samples whose abundance exceeds the specified quantile.
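
The quantile filter can be illustrated in base R. This sketch assumes that taxon abundance means the row total across all samples and that taxa strictly above the quantile are kept; it is not the package's code, and the data are hypothetical.

```r
# Hypothetical counts for 10 taxa in 4 samples; row totals increase with the taxon index
counts <- matrix(1:40, nrow = 10,
    dimnames = list(paste0("taxon", 1:10), paste0("sample", 1:4)))

taxonQ <- 0.9
abundance <- rowSums(counts)

# Keep only taxa whose total abundance exceeds the 0.9 quantile
keep <- abundance > quantile(abundance, taxonQ)
trimmed <- counts[keep, , drop = FALSE]
rownames(trimmed)   # the most abundant 10% of taxa: "taxon10"
```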

diversityStatistics summarizes per-sample diversity statistics using implementations from the ape package.
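
For orientation, two of the listed statistics can be computed per sample in a few lines of base R. This is a sketch under stated assumptions, not the implementation used by diversityStatistics: Shannon is taken with natural logarithms and Simpson as one minus the sum of squared proportions, and other conventions exist. The helper names shannon and simpson are hypothetical.

```r
# Shannon index: -sum(p * log(p)) over taxa that are present
shannon <- function(x) {
    p <- x[x > 0] / sum(x)
    -sum(p * log(p))
}

# Simpson index in the 1 - sum(p^2) form (assumed convention)
simpson <- function(x) {
    p <- x / sum(x)
    1 - sum(p^2)
}

# Hypothetical counts: one even sample and one dominated by a single taxon
counts <- cbind(even = c(5, 5, 5, 5), skewed = c(17, 1, 1, 1))

stats <- data.frame(
    Shannon = apply(counts, 2, shannon),
    Simpson = apply(counts, 2, simpson))
stats
```

The even sample reaches the maximum for four taxa (Shannon of log(4), Simpson of 0.75), while the skewed sample scores lower on both.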

occur and pa summarize and test for differences (using a chi-squared test) between the occurrence or presence / absence status of samples stratified by group, a column (factor) in the sample metadata.
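
The kind of test described above can be sketched for a single taxon with base R's table and chisq.test. The data, the grouping variable, and the use of a zero threshold are assumptions for illustration; the summaries returned by occur and pa may be organized differently.

```r
# Hypothetical counts of one taxon in 40 samples, 20 per group
counts <- c(rep(c(0, 2), c(15, 5)),    # Negative: 15 absent, 5 present
            rep(c(0, 3), c(5, 15)))    # Positive: 5 absent, 15 present
group <- factor(rep(c("Negative", "Positive"), each = 20))

# Presence means a count above the threshold (assumed to be 0 here)
threshold <- 0
status <- factor(counts > threshold, levels = c(FALSE, TRUE),
    labels = c("absent", "present"))

# Cross-tabulate group by presence status and test for association
tab <- table(group = group, status = status)
test <- chisq.test(tab)
tab
test$p.value
```

Note that chisq.test applies a continuity correction to 2 x 2 tables by default.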

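glmFit fits a model to each taxonomic group. The default FUN, glm.nb, fits a negative binomial generalized linear model; any fitting function with a formula interface can be supplied instead.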

glmSummary summarizes the fits returned by glmFit, providing a data frame of estimated coefficients and p-values for each fit. glmSummary currently supports fits produced by glmFit with FUN set to MASS::glm.nb, pscl::hurdle, or pscl::zeroinfl.
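
The fit-then-summarize pattern can be sketched in base R as follows. To stay self-contained, the sketch swaps in stats::glm with a Poisson family instead of MASS::glm.nb, so it is a stand-in for the approach rather than the package's implementation. The simulated data and all object names are hypothetical.

```r
set.seed(42)

# Hypothetical design: two groups of 10 samples; taxon1 differs between groups
group <- factor(rep(c("a", "b"), each = 10))
counts <- rbind(
    taxon1 = rpois(20, lambda = ifelse(group == "a", 5, 15)),
    taxon2 = rpois(20, lambda = 8))
samples <- data.frame(group = group)

# One model per taxon (row of the count matrix)
fits <- lapply(rownames(counts), function(taxon) {
    df <- cbind(samples, count = counts[taxon, ])
    glm(count ~ group, data = df, family = poisson)
})
names(fits) <- rownames(counts)

# Collect coefficient estimates and p-values into a single data frame
summaryTable <- do.call(rbind, lapply(names(fits), function(taxon) {
    co <- coef(summary(fits[[taxon]]))
    data.frame(taxon = taxon, term = rownames(co),
        estimate = co[, "Estimate"], p.value = co[, "Pr(>|z|)"],
        row.names = NULL)
}))
summaryTable
```

Replacing the glm call with another formula-based fitting function changes the model while leaving the per-taxon loop unchanged, although the coefficient extraction step may need adjusting for other model classes.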

Examples

## Object construction
dirpath <- system.file(package="microbiome", "extdata")
communities <- read.csv(file.path(dirpath, "communities.csv"),
    row.names=1L)
samples <- read.csv(file.path(dirpath, "samples.csv"),
    row.names=1L)
(cc <- Community(as.matrix(communities), samples))
head(samples(cc))

## 10% most abundant (over all samples) taxa
(cc1 <- trimq(cc, taxonQ=.9))
rownames(cc1)

## phenotypic data; summarized; subset
names(samples(cc))
summary(samples(cc)$nugent_2_group)
cc[, samples(cc)$nugent_2_group == "Positive (>= 7)"]

## descriptive statistics
head(diversityStatistics(cc))
pa(cc1[1:6,], "nugent_2_group")