aovBioCond adopts the modeling strategy implemented in limma
(see "References"), except that each interval has its own prior variance,
which is read from the mean-variance curve associated with the bioCond
objects. Technically, this function calculates a moderated F statistic for
each genomic interval to test the null hypothesis that the interval has the
same mean signal intensity across all the biological conditions. The
moderated F statistic is simply the F statistic from an ordinary one-way
ANOVA with its denominator (i.e., sample variance) replaced by the posterior
variance, which is defined to be a weighted average of the sample and prior
variances, with the weights being proportional to their respective numbers
of degrees of freedom.
This method of incorporating the prior information
increases the statistical power for the tests.
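
The construction described above can be sketched numerically for a single
interval. This is a minimal illustration in Python, not the package's
implementation: the function name moderated_f and its arguments are
hypothetical, the prior variance is assumed to have been read from a
mean-variance curve already, and refinements such as variance ratio factors
are ignored.

```python
from statistics import mean

def moderated_f(groups, prior_var, df_prior):
    """Moderated one-way ANOVA F statistic for one interval (illustrative).

    groups    : list of lists, signal intensities per biological condition
    prior_var : prior variance for this interval (assumed given)
    df_prior  : finite number of prior degrees of freedom
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    # Between-group mean square: numerator of the F statistic.
    msb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
    # Within-group sample variance and its residual degrees of freedom.
    df_resid = n - k
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    sample_var = ss_within / df_resid if df_resid > 0 else 0.0
    # Posterior variance: average of prior and sample variances,
    # weighted by their respective degrees of freedom.
    post_var = (df_prior * prior_var + df_resid * sample_var) / (df_prior + df_resid)
    # Return the statistic with its numerator and denominator degrees of freedom.
    return msb / post_var, (k - 1, df_prior + df_resid)

signals = [[1.0, 2.0], [3.0, 5.0], [6.0, 7.0]]
f_mod, dfs = moderated_f(signals, prior_var=0.5, df_prior=3)
f_ord, _ = moderated_f(signals, prior_var=0.5, df_prior=0)
print(f_mod, dfs, f_ord)
```

In the limma framework, the moderated statistic is compared against an F
distribution whose denominator degrees of freedom are the sum of the prior
and residual degrees of freedom, which is why both are returned here.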
Two extreme values can be specified for the argument df.prior
(number of degrees of freedom associated with the prior variances),
representing two distinct cases: when it is set to 0, the prior information
is not used at all, and the tests reduce to ordinary F tests in one-way
ANOVA; when it is set to Inf, the denominators of the moderated F statistics
are simply the prior variances, and these F statistics follow a scaled
chi-squared distribution. Other values of df.prior represent
intermediate cases. Note that the number of prior degrees of freedom is
automatically estimated for each mean-variance curve by a specifically
designed statistical method (see also fitMeanVarCurve and setMeanVarCurve)
and, by default, aovBioCond uses the estimation result to perform the tests.
Specifying df.prior explicitly when calling aovBioCond is strongly
discouraged unless you know exactly what you are doing. In addition,
aovBioCond does not adjust the variance ratio factors of the provided
bioCond objects based on the specified number of prior degrees of freedom
(see estimatePriorDf for a description of variance ratio factor).
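
To make the role of df.prior concrete, the following Python sketch evaluates
the posterior variance across a range of prior degrees of freedom, including
the two extremes. The function posterior_variance is a hypothetical helper
written for this illustration only; the sample variance, prior variance and
residual degrees of freedom are made-up numbers.

```python
import math

def posterior_variance(sample_var, df_resid, prior_var, df_prior):
    """Weighted average of sample and prior variances (illustrative)."""
    if math.isinf(df_prior):
        # df.prior = Inf: the sample variance is ignored entirely.
        return prior_var
    if df_prior == 0:
        # df.prior = 0: the prior is ignored, so a sample variance is required.
        if df_resid == 0:
            raise ValueError("no degrees of freedom to measure the variance")
        return sample_var
    return (df_prior * prior_var + df_resid * sample_var) / (df_prior + df_resid)

# As df_prior grows, the result moves from the sample variance (1.0)
# toward the prior variance (0.5).
for d0 in (0, 2, 20, 200, math.inf):
    print(d0, posterior_variance(sample_var=1.0, df_resid=3, prior_var=0.5, df_prior=d0))
```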
Note also that, if df.prior is set to 0, at least one of the bioCond
objects in conds must contain two or more ChIP-seq samples. Otherwise,
there is no way to measure the variance associated with each interval,
and an error is raised.
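
The requirement follows from counting residual degrees of freedom in a
one-way ANOVA: each condition contributes its number of samples minus one.
A minimal Python sketch, with a hypothetical helper name:

```python
def residual_df(sample_counts):
    """Residual degrees of freedom of a one-way ANOVA (illustrative).

    sample_counts : number of samples in each condition
    """
    return sum(n - 1 for n in sample_counts)

# Three conditions with one sample each leave nothing to estimate variance from.
print(residual_df([1, 1, 1]))
# A single condition with two samples is enough to obtain one degree of freedom.
print(residual_df([1, 2, 1]))
```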
Considering the practical significance of this analysis, which is to select
genomic intervals with differential ChIP-seq signals between at least one
pair of the biological conditions, those intervals not occupied by any of
the bioCond objects in conds may be filtered out before making the
selections. The p-values of the remaining intervals can then be re-adjusted
for multiple testing, which could potentially improve the statistical power
of the tests.
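
The effect of filtering before re-adjustment can be sketched as follows.
This Python example assumes the Benjamini-Hochberg procedure as the
adjustment method, which is an assumption of this illustration; the p-values
and the occupied flags are made-up data, and bh_adjust is a hypothetical
helper rather than part of the package.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up raw p-values, and whether each interval is occupied by
# at least one condition.
pvals = [0.001, 0.02, 0.03, 0.5, 0.8, 0.9]
occupied = [True, True, True, False, False, True]

# Adjusting over all intervals versus only the occupied ones.
padj_all = bh_adjust(pvals)
kept = [p for p, occ in zip(pvals, occupied) if occ]
padj_kept = bh_adjust(kept)
print(padj_all)
print(padj_kept)
```

Because fewer hypotheses enter the correction, the adjusted p-values of the
retained intervals are no larger than before in this example.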