calcNormFactors

Calculate Normalization Factors to Align Columns of a Count Matrix

Calculate normalization factors to scale the raw library sizes.

Usage
## S3 method for class 'DGEList'
calcNormFactors(object, method=c("TMM","RLE","upperquartile","none"), refColumn=NULL, logratioTrim=.3, sumTrim=0.05, doWeighting=TRUE, Acutoff=-1e10, p=0.75, ...)

## S3 method for class 'default'
calcNormFactors(object, lib.size=NULL, method=c("TMM","RLE","upperquartile","none"), refColumn=NULL, logratioTrim=.3, sumTrim=0.05, doWeighting=TRUE, Acutoff=-1e10, p=0.75, ...)
Arguments
object
either a matrix of raw (read) counts or a DGEList object
lib.size
numeric vector of library sizes of the object.
method
normalization method to be used
refColumn
column to use as reference for method="TMM". Can be a column number or a numeric vector of length nrow(object).
logratioTrim
amount of trim to use on log-ratios ("M" values) for method="TMM"
sumTrim
amount of trim to use on the combined absolute levels ("A" values) for method="TMM"
doWeighting
logical, whether to compute (asymptotic binomial precision) weights for method="TMM"
Acutoff
cutoff on "A" values to use before trimming for method="TMM"
p
percentile (between 0 and 1) of the counts that is aligned when method="upperquartile"
...
further arguments that are not currently used.
Details

method="TMM" is the weighted trimmed mean of M-values (to the reference) proposed by Robinson and Oshlack (2010), where the weights are from the delta method on Binomial data. If refColumn is unspecified, the library whose upper quartile is closest to the mean upper quartile is used.
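The pairwise computation behind TMM can be sketched in base R. This is a simplified illustration in the spirit of Robinson and Oshlack (2010), not the package's own code: the helper name tmm_pair is hypothetical, the argument names simply mirror those documented above, and column 1 is used as the reference for brevity rather than the upper-quartile rule described in the text.

```r
# Hypothetical helper: TMM factor of one library (obs) relative to a reference (ref).
tmm_pair <- function(obs, ref, logratioTrim = 0.3, sumTrim = 0.05,
                     doWeighting = TRUE, Acutoff = -1e10) {
  nO <- sum(obs)
  nR <- sum(ref)
  M <- log2((obs / nO) / (ref / nR))                   # log-ratios ("M" values)
  A <- (log2(obs / nO) + log2(ref / nR)) / 2           # absolute levels ("A" values)
  v <- (nO - obs) / nO / obs + (nR - ref) / nR / ref   # delta-method variance of M

  # Keep genes with finite M and A, and with A above the cutoff.
  ok <- is.finite(M) & is.finite(A) & (A > Acutoff)
  M <- M[ok]; A <- A[ok]; v <- v[ok]

  # Trim the extremes of M and of A by rank.
  n <- length(M)
  loM <- floor(n * logratioTrim) + 1; hiM <- n + 1 - loM
  loA <- floor(n * sumTrim) + 1;      hiA <- n + 1 - loA
  keep <- rank(M) >= loM & rank(M) <= hiM & rank(A) >= loA & rank(A) <= hiA

  # Precision-weighted (or plain) mean of the surviving M values, back on the ratio scale.
  m <- if (doWeighting) sum(M[keep] / v[keep]) / sum(1 / v[keep]) else mean(M[keep])
  2^m
}

set.seed(1)
y <- matrix(rpois(1000, lambda = 5), nrow = 200)

# Assumption for this sketch: column 1 is the reference library.
f_tmm <- apply(y, 2, tmm_pair, ref = y[, 1])

# Rescale so that the factors multiply to 1.
f_tmm <- f_tmm / exp(mean(log(f_tmm)))
f_tmm
```

A library compared with itself gives a factor of exactly 1, which is a quick sanity check on the helper.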

method="RLE" is the scaling factor method proposed by Anders and Huber (2010). We call it "relative log expression" because a reference ("median") library is formed from the gene-wise geometric mean across all columns, and the median ratio of each sample to this reference library is taken as the scale factor.
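A minimal base-R sketch of the RLE idea follows. The helper name rle_factors is hypothetical, and two details are assumptions of this sketch rather than statements from the text above: genes whose geometric mean is zero are skipped, and the median ratios are divided by the library sizes before being rescaled to multiply to 1.

```r
# Hypothetical helper illustrating the median-ratio (RLE) scale factors.
rle_factors <- function(counts) {
  counts <- as.matrix(counts)
  lib.size <- colSums(counts)

  # Gene-wise geometric mean across columns: the reference "median library".
  # Any gene with a zero count in some column gets a geometric mean of 0.
  gm <- exp(rowMeans(log(counts)))

  # Median ratio of each sample to the reference, over genes with gm > 0.
  f <- apply(counts, 2, function(u) median((u / gm)[gm > 0]))

  # Express relative to library size, then rescale to multiply to 1.
  f <- f / lib.size
  f / exp(mean(log(f)))
}

set.seed(1)
y <- matrix(rpois(1000, lambda = 5), nrow = 200)
f_rle <- rle_factors(y)
f_rle
```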

method="upperquartile" is the upper-quartile normalization method of Bullard et al. (2010), in which the scale factors are calculated from the 75% quantile of the counts for each library, after removing genes that are zero in all libraries. This idea is generalized here to allow scaling by any quantile of the count distributions, set through the argument p.
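The quantile-based factors can be sketched in a few lines of base R. The helper name uq_factors is hypothetical. As an assumption of this sketch, counts are divided by the library size before the quantile is taken, and the result is rescaled to multiply to 1. The sketch also assumes the chosen quantile is non-zero in every library, since a zero quantile would break the rescaling step.

```r
# Hypothetical helper illustrating quantile-based scale factors.
uq_factors <- function(counts, p = 0.75) {
  counts <- as.matrix(counts)
  lib.size <- colSums(counts)

  # Drop genes that are zero in every library.
  keep <- rowSums(counts) > 0
  counts <- counts[keep, , drop = FALSE]

  # p-th quantile of the library-size-scaled counts in each column.
  scaled <- t(t(counts) / lib.size)
  f <- apply(scaled, 2, quantile, probs = p)

  # Rescale so that the factors multiply to 1.
  f / exp(mean(log(f)))
}

set.seed(1)
y <- matrix(rpois(1000, lambda = 5), nrow = 200)
f_uq <- uq_factors(y)             # upper quartile, p = 0.75
f_med <- uq_factors(y, p = 0.5)   # same idea with the median

# Appending all-zero rows leaves the factors unchanged.
y_padded <- rbind(y, matrix(0L, nrow = 10, ncol = ncol(y)))
f_uq_padded <- uq_factors(y_padded)
```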

If method="none", then the normalization factors are set to 1.

For symmetry, normalization factors are adjusted to multiply to 1. The effective library size is then the original library size multiplied by the scaling factor.
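As a small numeric illustration, dividing a set of factors by their geometric mean makes them multiply to 1, and the effective library sizes follow by multiplication. The library sizes and raw factors below are made-up numbers chosen only for this example.

```r
lib.size <- c(1.0e6, 2.0e6, 1.5e6)   # made-up library sizes
raw <- c(0.9, 1.2, 1.05)             # made-up unscaled factors

# Divide by the geometric mean so that the factors multiply to 1.
norm.factors <- raw / exp(mean(log(raw)))
prod(norm.factors)                   # 1, up to floating-point error

# Effective library size = original library size * normalization factor.
eff.lib.size <- lib.size * norm.factors
eff.lib.size
```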

Note that rows that have zero counts for all columns are trimmed before normalization factors are computed. Therefore rows with all zero counts do not affect the estimated factors.

Value

If object is a matrix, the output is a vector with length ncol(object) giving the relative normalization factors. If object is a DGEList, then it is returned as output with the relative normalization factors in object$samples$norm.factors.
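The two return shapes can be compared side by side. This sketch assumes edgeR is installed and that the DGEList library sizes are left at their default column sums, in which case both calls should yield the same factors.

```r
library(edgeR)

set.seed(1)
y <- matrix(rpois(1000, lambda = 5), nrow = 200)

# Matrix in, numeric vector of length ncol(y) out.
f <- calcNormFactors(y)

# DGEList in, DGEList out, with the factors stored in the samples table.
d <- DGEList(counts = y)
d <- calcNormFactors(d)
d$samples$norm.factors
```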

References

Anders S, Huber W (2010). Differential expression analysis for sequence count data. Genome Biology 11, R106.

Bullard JH, Purdom E, Hansen KD, Dudoit S (2010). Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics 11, 94.

Robinson MD, Oshlack A (2010). A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biology 11, R25.

Aliases
  • calcNormFactors
  • calcNormFactors.DGEList
  • calcNormFactors.default
Examples
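library(edgeR)
# Simulate a 200 x 5 matrix of Poisson counts and compute TMM factors (the default method)
y <- matrix( rpois(1000, lambda=5), nrow=200 )
calcNormFactors(y)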
Documentation reproduced from package edgeR, version 3.14.0, License: GPL (>=2)
