glmQLFit(y, design=NULL, dispersion=NULL, abundance.trend=TRUE, robust=FALSE, winsor.tail.p=c(0.05, 0.1), ...)
glmQLFTest(glmfit, coef=ncol(glmfit$design), contrast=NULL)
DGEList
object containing count and sample data.robust=TRUE
.glmFit
.DGEGLM
object, usually output from qlmQLFit
.glmQLFit
produces an object of class DGEGLM
with the same components as that produced by glmFit
, plus:
df.prior
, the prior degrees of fredom; and var.prior
, the location of the prior distribution. Both are numeric vectors if abundance.trend=TRUE
and scalars otherwise. var.post
is a numeric vector containing the shrunk quasi-likelihood dispersion for each gene.s2.fit
.glmQFTest
produces objects of class DGELRT
with the same components as for glmfit
plus the following:
y
containing the log2-fold changes, F-statistics and p-values, ready to be displayed by topTags.
.table
contains the following columns:
y
.glmQLFTest
implements the quasi-likelihood method of Lund et al (2012).
It behaves the same as glmLRT
except that it replaces likelihood ratio tests with quasi-likelihood F-tests for coefficients in the linear model.
This function calls the limma function squeezeVar
to conduct empirical Bayes smoothing of the genewise multiplicative dispersions.
Note that the QuasiSeq
package provides a alternative implementation of Lund et al (2012), with slightly different glm, trend and FDR methods.There are a number of subtleties involved in the use of QL models. The first is that the negative binomial dispersions must be trended or common values. This is because the function assumes that the supplied values are the true values. For the trended/common values, the assumption is reasonable as information from many genes improves precision. This is not the case for the tagwise dispersions due to the limited information for each gene.
Another subtlety involves the handling of zero counts. Observations with fitted values of zero provide no residual degrees of freedom. This must be considered when computing the value of the quasi-likelihood dispersion for genes with many zeros. Finally, a lower bound is defined for the p-value of each gene, based on the likelihood ratio test. This avoids spurious results involving weak shrinkage with very low quasi-likelihood dispersions.
topTags
displays results from glmQLFTest
.plotQLDisp
can be used to visualize the distribution of QL dispersions after EB shrinkage from glmQLFit
.
The QuasiSeq
package gives an alternative implementation of glmQLFTest
based on the same statistical ideas.
nlibs <- 4
ntags <- 1000
dispersion.true <- 1/rchisq(ntags, df=10)
design <- model.matrix(~factor(c(1,1,2,2)))
# Generate count data
y <- rnbinom(ntags*nlibs,mu=20,size=1/dispersion.true)
y <- matrix(y,ntags,nlibs)
d <- DGEList(y)
d <- calcNormFactors(d)
# Fit the NB GLMs with QL methods
d <- estimateDisp(d, design)
fit <- glmQLFit(d, design)
results <- glmQLFTest(fit)
topTags(results)
fit <- glmQLFit(d, design, robust=TRUE)
results <- glmQLFTest(fit)
topTags(results)
fit <- glmQLFit(d, design, abundance.trend=FALSE)
results <- glmQLFTest(fit)
topTags(results)
Run the code above in your browser using DataLab