estimateDisp: Estimate Common, Trended and Tagwise Negative Binomial dispersions by weighted likelihood empirical Bayes

Description

Maximizes the negative binomial likelihood to estimate the common, trended and tagwise dispersions across all tags.

Usage

estimateDisp(y, design=NULL, prior.df=NULL, trend.method="locfit", span=NULL, min.row.sum=5, grid.length=21, grid.range=c(-10,10), robust=FALSE, winsor.tail.p=c(0.05,0.1), tol=1e-06)

Arguments

y

DGEList object

design

numeric design matrix

prior.df

prior degrees of freedom. It is used in calculating prior.n.

trend.method

method for estimating dispersion trend. Possible values are "none", "movingave", "loess" and "locfit".

span

width of the smoothing window, as a proportion of the data set.

min.row.sum

numeric scalar giving a value for the filtering out of low abundance tags. Only tags with total sum of counts above this value are used. Low abundance tags can adversely affect the dispersion estimation, so this argument allows the user to select an appropriate filter threshold for the tag abundance.

grid.length

the number of points on which the interpolation is applied for each tag.

grid.range

the range of the grid points around the trend on a log2 scale.

robust

logical, should the estimation of prior.df be robustified against outliers?

winsor.tail.p

numeric vector of length 1 or 2, giving left and right tail proportions of the deviances to Winsorize when estimating prior.df.

tol

the desired accuracy, passed to optimize
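
As a rough illustration of how grid.length and grid.range describe a set of candidate dispersions, the sketch below builds an evenly spaced grid on the log2 scale and converts it to dispersion values. The center value center.disp is a hypothetical placeholder chosen for illustration; it is an assumption and is not taken from edgeR's internals.

```r
# Illustration only: an evenly spaced grid on the log2 scale.
grid.length <- 21
grid.range <- c(-10, 10)

# log2 offsets from the center of the grid
log2.offsets <- seq(from = grid.range[1], to = grid.range[2],
                    length.out = grid.length)

# Hypothetical center value (an assumption, not an edgeR internal)
center.disp <- 0.1

# Candidate dispersions; with these defaults the offsets are one log2 unit
# apart, so each grid point is double the previous one
grid.disp <- center.disp * 2^log2.offsets

range(grid.disp)
```

A wider grid.range covers more extreme dispersions, and a larger grid.length places the candidate values closer together.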

Value

common.dispersion: estimate of the common dispersion.
trended.dispersion: estimates of the trended dispersions.
tagwise.dispersion: tag- or gene-wise estimates of the dispersion parameter.
logCPM: the tag abundance in log average counts per million.
prior.df: prior degrees of freedom. It is a vector when the robust method is used.
prior.n: estimate of the prior weight, i.e. the smoothing parameter that indicates the weight to put on the common likelihood compared to the individual tag's likelihood.
span: width of the smoothing window used in estimating dispersions.

Details

This function calculates a matrix of likelihoods for each gene at a set of dispersion grid points, and then applies the weighted likelihood empirical Bayes method to obtain posterior dispersion estimates. If there is no design matrix, it calculates the quantile conditional likelihood for each gene (tag) and then maximizes it. In that case the method is the same as in the functions estimateCommonDisp and estimateTagwiseDisp. If a design matrix is given, it instead calculates the adjusted profile log-likelihood for each gene (tag) and then maximizes it. In that case it is similar to the functions estimateGLMCommonDisp, estimateGLMTrendedDisp and estimateGLMTagwiseDisp.
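
The idea of combining each gene's likelihood with a shared likelihood on a grid can be sketched in base R. This is a deliberately simplified illustration and not edgeR's implementation: it uses the plain negative binomial log-likelihood with the mean fixed at the row mean (rather than the quantile conditional or adjusted profile likelihood), takes the shared likelihood to be the average over all genes, picks the best grid point without interpolating, and uses a hypothetical prior weight prior.n and a hypothetical grid center of 0.1.

```r
set.seed(1)

# Simulated counts: 200 genes, 4 libraries, true dispersion 1/size = 0.2
ngenes <- 200
y <- matrix(rnbinom(ngenes * 4, mu = 10, size = 5), ncol = 4)

# Candidate dispersions on a log2 grid (the center of 0.1 is an assumption)
grid.disp <- 0.1 * 2^seq(-10, 10, length.out = 21)

# Per-gene mean, kept away from zero so the likelihood is well defined
mu <- pmax(rowMeans(y), 0.5)

# Gene-by-grid matrix of negative binomial log-likelihoods
loglik <- sapply(grid.disp, function(phi) {
  ll <- matrix(dnbinom(y, mu = mu, size = 1 / phi, log = TRUE), nrow = ngenes)
  rowSums(ll)
})

# Shared likelihood: average over genes at each grid point
shared <- colMeans(loglik)

# Weighted likelihood: each gene's own curve plus prior.n times the shared curve
prior.n <- 10  # hypothetical prior weight
weighted <- sweep(loglik, 2, prior.n * shared, "+")

# Grid-point maximizers (this sketch does not interpolate between grid points)
common.disp <- grid.disp[which.max(shared)]
tagwise.disp <- grid.disp[max.col(weighted, ties.method = "first")]

common.disp
table(tagwise.disp)
```

Raising prior.n in this sketch pulls the per-gene estimates toward the shared estimate, while setting it to 0 leaves each gene to be estimated from its own counts alone.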

References

Chen, Y, Lun, ATL, and Smyth, GK (2014). Differential expression analysis of complex RNA-seq experiments using edgeR. In: Statistical Analysis of Next Generation Sequence Data, Somnath Datta and Daniel S Nettleton (eds), Springer, New York. http://www.statsci.org/smyth/pubs/edgeRChapterPreprint.pdf

Examples


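library(edgeR)

# Simulate 250 tags in 4 libraries; true dispersion is 1/size = 1/5 = 0.2
y <- matrix(rnbinom(1000, mu=10, size=5), ncol=4)
group <- c(1,1,2,2)
design <- model.matrix(~group)
d <- DGEList(counts=y, group=group)

# Without a design matrix: quantile conditional likelihood
d1 <- estimateDisp(d)

# With a design matrix: adjusted profile likelihood
d2 <- estimateDisp(d, design)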
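
To check the fit, the components listed under Value can be read directly from the returned object. The following is a minimal sketch that assumes the edgeR package is installed; the seed is a hypothetical choice made only so that the simulated counts are reproducible.

```r
library(edgeR)  # Bioconductor package; assumed to be installed

set.seed(42)  # hypothetical seed, for reproducibility only

# Same simulation as above: true dispersion is 1/size = 0.2
y <- matrix(rnbinom(1000, mu = 10, size = 5), ncol = 4)
group <- c(1, 1, 2, 2)
design <- model.matrix(~group)
d <- DGEList(counts = y, group = group)
d2 <- estimateDisp(d, design)

# Single common estimate, to compare against the true value of 0.2
d2$common.dispersion

# Spread of the per-tag estimates
summary(d2$tagwise.dispersion)

# Prior degrees of freedom used for the empirical Bayes shrinkage
d2$prior.df
```

Because the counts are simulated with one dispersion for every tag, the common estimate is the natural one to compare with 0.2.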
