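The following function returns fragment counts normalized per kilobase of feature length per million mapped fragments (by default using a robust estimate of the library size, as in estimateSizeFactors).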
fpkm(object, robust = TRUE)
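object: a DESeqDataSet
robust: whether to use size factors to normalize rather than taking the column sums of the raw counts, using the fpm function.
Returns a matrix which is normalized per kilobase of the union of basepairs in the GRangesList or GRanges of the mcols(object), and per million of mapped fragments, either using the robust median ratio method (robust=TRUE, default) or using raw counts (robust=FALSE). Defining a column mcols(object)$basepairs takes precedence over internal calculation of the kilobases for each row.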
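The raw-count case (robust=FALSE) can be worked through by hand. The following is a minimal base R sketch, not the package's implementation: it assumes that the raw-count path divides each column by its total in millions and each row by its length in kilobases. The toy counts, the feature names g1 and g2, and the basepairs vector are hypothetical.

```r
# Hypothetical toy data: 2 features (rows) x 2 samples (columns)
counts <- matrix(c(100, 300, 200, 600), ncol = 2,
                 dimnames = list(c("g1", "g2"), c("s1", "s2")))
# Hypothetical feature lengths in basepairs
basepairs <- c(g1 = 1000, g2 = 2000)

# Fragments per million: divide each column by its library size in millions
fpm_raw <- sweep(counts, 2, colSums(counts) / 1e6, "/")

# Fragments per kilobase per million: divide each row by its length in kb
fpkm_raw <- sweep(fpm_raw, 1, basepairs / 1e3, "/")
fpkm_raw
```

Here g2 has three times the counts of g1 but twice the length, so its FPKM is 1.5 times that of g1 in both samples.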
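The length of the features (e.g. genes) is calculated one of two ways:
(1) If there is a matrix named "avgTxLength" in assays(dds), this will take precedence in the length normalization.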
This occurs when using the tximport-DESeq2 pipeline.
(2) Otherwise, feature length is calculated from the rowRanges of the dds object, if a column basepairs is not present in mcols(dds). The calculated length is the number of basepairs in the union of all GRanges assigned to a given row of object, e.g., the union of all basepairs of exons of a given gene.
Note that the second approach over-estimates the gene length (average transcript length, weighted by abundance, is a more appropriate normalization for gene counts), and so the FPKM will be an underestimate of the true value.
Note that, when the read/fragment counting has inter-feature dependencies, a strict normalization would not incorporate the basepairs of a feature which overlap another feature. This inter-feature dependence is not taken into consideration in the internal union basepair calculation.
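To illustrate what "union of basepairs" means, the sketch below compares the summed exon widths with the width of their union for one hypothetical gene. It uses reduce() and width() from GenomicRanges; whether fpkm computes the length in exactly this way internally is an assumption, and the exon coordinates are invented for illustration.

```r
library(GenomicRanges)

# Hypothetical gene with three exons; the first two overlap by 500 bp
exons <- GRanges("chr1", IRanges(start = c(1, 501, 2001),
                                 end   = c(1000, 1500, 2500)))

# Summing widths counts the overlapping 500 bp twice
summed_bp <- sum(width(exons))

# reduce() merges overlapping ranges, so each basepair is counted once
union_bp <- sum(width(reduce(exons)))

c(summed = summed_bp, union = union_bp)
```

The union (2000 bp) is shorter than the summed widths (2500 bp), but it can still be longer than the abundance-weighted average transcript length, which is why the note above describes the resulting FPKM as an underestimate.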
See also: fpm
library(DESeq2)
# create a matrix with 1 million counts for the
# 2nd and 3rd column, the 1st and 4th have
# half and double the counts, respectively.
m <- matrix(1e6 * rep(c(.125, .25, .25, .5), each=4),
ncol=4, dimnames=list(1:4,1:4))
mode(m) <- "integer"
se <- SummarizedExperiment(list(counts=m), colData=DataFrame(sample=1:4))
dds <- DESeqDataSet(se, ~ 1)
# create 4 GRanges with lengths: 1, 1, 2, 0.5 Kb
gr1 <- GRanges("chr1",IRanges(1,1000)) # 1kb
gr2 <- GRanges("chr1",IRanges(c(1,1001),c( 500,1500))) # 1kb
gr3 <- GRanges("chr1",IRanges(c(1,1001),c(1000,2000))) # 2kb
gr4 <- GRanges("chr1",IRanges(c(1,1001),c(200,1300))) # 500bp
rowRanges(dds) <- GRangesList(gr1,gr2,gr3,gr4)
# the raw counts
counts(dds)
# the FPM values
fpm(dds)
# the FPKM values
fpkm(dds)