testDiversity performs pairwise significance tests of the diversity index
($D$) at a given diversity order ($q$) for a set of annotation groups via
rarefaction and bootstrapping.

testDiversity(data, q, group, clone = "CLONE", copy = NULL, min_n = 30,
  max_n = NULL, nboot = 2000)

group: data column containing group identifiers.
clone: data column containing clone identifiers.
copy: data column containing copy numbers for each
sequence. If copy=NULL (the default), then clone abundance
is determined by the number of sequences. If a copy column
is specified, then clone abundance is d…
max_n: … NULL, the maximum is automatically determined from the size of the largest group.

A DiversityTest object containing p-values and summary statistics.

See calcDiversity for further details. Diversity is calculated on the estimated complete clonal abundance distribution. This distribution is inferred by using the Chao1 estimator to estimate the number of unseen clones, and applying the relative abundance correction and unseen clone frequency described in Chao et al, 2014.
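
As a rough illustration of the Chao1 step, the sketch below computes the standard bias-corrected Chao1 richness estimate from a vector of per-clone counts in base R. The helper name chao1_richness and the example counts are hypothetical, and the package's internal implementation may differ. The relative abundance correction from Chao et al is not shown.

```r
# Bias-corrected Chao1 richness estimate from per-clone sequence counts.
# chao1_richness is a hypothetical helper, not an alakazam function.
chao1_richness <- function(counts) {
    counts <- counts[counts > 0]
    s_obs <- length(counts)      # clones actually observed
    f1 <- sum(counts == 1)       # clones seen exactly once (singletons)
    f2 <- sum(counts == 2)       # clones seen exactly twice (doubletons)
    # Observed richness plus the estimated number of unseen clones
    s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
}

# Made-up counts: 7 observed clones, 3 singletons, 2 doubletons
counts <- c(10, 5, 2, 2, 1, 1, 1)
chao1_richness(counts)   # 7 + (3 * 2) / (2 * 3) = 8
```

The difference between the estimate and the observed richness (here, one clone) is the estimated number of unseen clones.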
Variability in total sequence counts across unique values in the group column is
corrected by repeated resampling from the estimated complete clonal distribution to
a common number of sequences. The diversity index estimate ($D$) for each group is
the mean value of $D$ over all bootstrap realizations.
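
The resampling step can be sketched in base R as follows. This is a minimal illustration, assuming the generalized (Hill) diversity index and multinomial resampling. The helpers hill_diversity and bootstrap_diversity, the abundance vectors, and the sample size are all made up for the example and are not part of the package.

```r
# Hill diversity of order q for a vector of relative abundances.
hill_diversity <- function(p, q) {
    p <- p[p > 0]
    if (q == 1) {
        exp(-sum(p * log(p)))    # limit as q -> 1: exponential of Shannon entropy
    } else {
        sum(p^q)^(1 / (1 - q))
    }
}

# Draw n sequences from abundance distribution p, nboot times, and
# return the diversity of each bootstrap realization.
bootstrap_diversity <- function(p, n, q, nboot = 200) {
    draws <- rmultinom(nboot, size = n, prob = p)   # one column per realization
    apply(draws, 2, function(x) hill_diversity(x / sum(x), q))
}

set.seed(1)
p_a <- c(0.5, 0.3, 0.1, 0.1)    # made-up abundance distribution for one group
p_b <- rep(0.25, 4)             # made-up abundance distribution for another group
n_common <- 50                  # common number of sequences for both groups

d_a <- bootstrap_diversity(p_a, n_common, q = 2)
d_b <- bootstrap_diversity(p_b, n_common, q = 2)

# Diversity estimate for each group: mean over bootstrap realizations
c(a = mean(d_a), b = mean(d_b))
```

Sampling both groups to the same n_common is what makes the resulting estimates comparable despite different sequencing depths.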
Significance of the difference in diversity index ($D$) between groups is tested by
constructing a bootstrap delta distribution for each pair of unique values in the
group column. The bootstrap delta distribution is built by subtracting the diversity
index $Da$ in $group-a$ from the corresponding value $Db$ in $group-b$,
for all bootstrap realizations, yielding a distribution of nboot total deltas; where
$group-a$ is the group with the greater mean $D$. The p-value for hypothesis
$Da != Db$ is the value of $P(0)$ from the empirical cumulative distribution
function of the bootstrap delta distribution, multiplied by 2 for the two-tailed correction.
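
A minimal base R sketch of the p-value calculation is shown below. The bootstrap diversity values are simulated stand-ins rather than real rarefaction output. The sketch takes each delta as the value for the group with the greater mean minus the value for the other group, which is an assumption about the sign convention chosen so that $P(0)$ is the lower tail. Capping the result at 1 is also an assumption of this sketch.

```r
set.seed(42)
nboot <- 2000

# Simulated bootstrap diversity values for two groups (illustrative only)
d_a <- rnorm(nboot, mean = 12, sd = 1)
d_b <- rnorm(nboot, mean = 10, sd = 1)

# Order the pair so that group a is the one with the greater mean D
if (mean(d_a) < mean(d_b)) {
    tmp <- d_a
    d_a <- d_b
    d_b <- tmp
}

# One delta per bootstrap realization
delta <- d_a - d_b

# P(0): fraction of deltas at or below zero, from the empirical CDF
p0 <- ecdf(delta)(0)

# Two-tailed correction, capped at 1 (assumption of this sketch)
p_value <- min(1, 2 * p0)
p_value
```

When the two bootstrap distributions barely overlap, few deltas fall at or below zero and the p-value is small. When they overlap heavily, $P(0)$ approaches 0.5 and the p-value approaches 1.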
See calcDiversity for the basic calculation and
DiversityTest for the return object.
See rarefyDiversity for curve generation.
See ecdf for computation of the empirical cumulative
distribution function.

# Load example data
file <- system.file("extdata", "ExampleDb.gz", package="alakazam")
df <- readChangeoDb(file)
# Groups under the size threshold are excluded and a warning message is issued.
testDiversity(df, "SAMPLE", q=0, min_n=30, nboot=100)