"estimateSCV"( object, method = c( "pooled", "per-condition", "blind" ), sharingMode = c( "maximum", "fit-only", "gene-est-only" ), fitType = c("local","parametric"), locfit_extra_args=list(), lp_extra_args=list(), ... )
object
An XBSeqDataSet with size factors.
pooled
- Use the samples from all conditions with
replicates to estimate a single pooled empirical dispersion value,
called "pooled", and assign it to all samples.
per-condition
- For each condition with replicates, compute
a gene's empirical dispersion value by considering the data from samples for this
condition. For samples of unreplicated conditions, the maximum
of empirical dispersion values from the other conditions is used.
blind
- Ignore the sample labels and compute a
gene's empirical dispersion value as if all samples were replicates of a
single condition. This can be done even if there are no biological
replicates. This method can lead to loss of power.
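The difference between the three methods can be illustrated with a toy calculation in base R. This is only a sketch: `toy_variance` is a hypothetical helper, not part of XBSeq, and it works on the raw sample variance of a single gene rather than on the dispersion estimator that estimateSCV actually uses.

```r
# Toy counts for one gene: three replicates in each of two conditions.
counts <- c(10, 12, 14, 30, 32, 34)
conditions <- factor(c(rep("C1", 3), rep("C2", 3)))

# Hypothetical helper (not an XBSeq function): variance of one gene
# under the three grouping strategies described above.
toy_variance <- function(x, cond,
                         method = c("pooled", "per-condition", "blind")) {
  method <- match.arg(method)
  per_cond <- tapply(x, cond, var)     # one variance per condition
  n <- tapply(x, cond, length)         # replicates per condition
  switch(method,
    "per-condition" = per_cond,
    # pooled: average the per-condition variances, weighted by degrees of freedom
    "pooled" = sum((n - 1) * per_cond) / sum(n - 1),
    # blind: ignore the labels and treat all samples as one condition
    "blind" = var(x))
}

toy_variance(counts, conditions, "pooled")   # 4
toy_variance(counts, conditions, "blind")    # 123.2
```

In this toy case the blind estimate is much larger because the difference between the two conditions is counted as noise, which is one way to see why blind estimation can cost power.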
sharingMode
Each gene has two dispersion values, the empirical value and the
fitted value. The sharingMode argument specifies which of these two
values is written to the dispEst slot and hence used by the
function XBSeqTest:
fit-only
- use only the fitted value, i.e., the
empirical value is used only as input to the fitting, and then
ignored. Use this only with very few replicates, and when
you are not too concerned about false positives from dispersion outliers, i.e. genes
with an unusually high variability.
maximum
- take the maximum of the two values. This is
the conservative or prudent choice, recommended once you have at
least three or four replicates and maybe even with only two replicates.
gene-est-only
- No fitting or sharing, use only the
empirical value. This method is preferable when the number of
replicates is large and the empirical dispersion values are
sufficiently reliable. If the number of replicates is small, this
option may lead to many cases where the dispersion
of a gene is accidentally underestimated and a false positive arises in
the subsequent testing.
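The three sharing modes amount to a per-gene choice between the empirical and the fitted value. A minimal base R sketch, assuming both values are already available as numeric vectors (`share_dispersion` and the numbers are hypothetical and only illustrate the selection rule, not the XBSeq internals):

```r
# Hypothetical helper (not an XBSeq function): choose the dispersion
# value per gene according to the sharing mode.
share_dispersion <- function(empirical, fitted,
                             sharingMode = c("maximum", "fit-only", "gene-est-only")) {
  sharingMode <- match.arg(sharingMode)
  switch(sharingMode,
    "maximum"       = pmax(empirical, fitted),  # element-wise maximum
    "fit-only"      = fitted,                   # ignore the empirical value
    "gene-est-only" = empirical)                # ignore the fitted value
}

# Made-up values for three genes.
empirical <- c(0.02, 0.50, 0.10)
fitted    <- c(0.08, 0.09, 0.10)

share_dispersion(empirical, fitted, "maximum")        # 0.08 0.50 0.10
share_dispersion(empirical, fitted, "fit-only")       # 0.08 0.09 0.10
share_dispersion(empirical, fitted, "gene-est-only")  # 0.02 0.50 0.10
```

The second gene shows why "maximum" is the conservative choice: its unusually high empirical dispersion is kept, whereas "fit-only" would replace it with the much smaller fitted value.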
parametric
- Fit a dispersion-mean relation of the
form dispersion = asymptDisp + extraPois / mean via a robust
gamma-family GLM. The coefficients asymptDisp and extraPois
are given in the attribute coefficients of the dispFunc
in the fitInfo.
local
- Use the locfit package to fit a dispersion-mean
relation, as described in the DESeq paper.
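To get a feel for the parametric form, the relation can be evaluated directly in base R. The function name `disp_parametric` and the coefficient values below are made up for illustration; in practice the coefficients come from the fit.

```r
# Hypothetical helper: evaluate dispersion = asymptDisp + extraPois / mean.
disp_parametric <- function(mean, asymptDisp, extraPois) {
  asymptDisp + extraPois / mean
}

means <- c(1, 10, 100, 1000)
disp_parametric(means, asymptDisp = 0.1, extraPois = 2)
# 2.100 0.300 0.120 0.102
```

As the mean grows, the extraPois / mean term vanishes and the dispersion approaches asymptDisp, so weakly expressed genes are assigned a larger fitted dispersion than strongly expressed ones.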
locfit_extra_args, lp_extra_args (used only if fitType=local)
Options to be passed to the locfit function and to the lp
function of the locfit package. Use these to adjust the local
fitting. For example, you may pass a value for nn different
from the default (0.7) if the fit seems too smooth or too rough by
setting lp_extra_args=list(nn=0.9). As another example, you
can set locfit_extra_args=list(maxk=200) if you get the
error that locfit ran out of nodes. See the documentation of the
locfit package for details. In most cases, you will not
need to provide these parameters, as the defaults work well.
The XBSeqDataSet cds, with the slots fitInfo and dispEst updated.
method="pooled". Otherwise, try method="per-condition".
We revised the code to estimate the variance of the true
signal using the variance sum law rather than calculating the variance directly.
XBSeqDataSet
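library(XBSeq)
# Two conditions with three replicates each
conditions <- factor(c(rep('C1', 3), rep('C2', 3)))
data(ExampleData)
XB <- XBSeqDataSet(Observed, Background, conditions)
XB <- estimateRealCount(XB)
# Size factors must be estimated before calling estimateSCV
XB <- estimateSizeFactors(XB)
XB <- estimateSCV(XB, fitType='local')
# Inspect the fitted dispersion information
str(fitInfo(XB))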