edgeR (version 3.14.0)

cpm: Counts per Million or Reads per Kilobase per Million

Description

Computes counts per million (CPM) or reads per kilobase per million (RPKM) values.

Usage

"cpm"(x, normalized.lib.sizes=TRUE, log=FALSE, prior.count=0.25, ...) "cpm"(x, lib.size=NULL, log=FALSE, prior.count=0.25, ...) "rpkm"(x, gene.length=NULL, normalized.lib.sizes=TRUE, log=FALSE, prior.count=0.25, ...) "rpkm"(x, gene.length, lib.size=NULL, log=FALSE, prior.count=0.25, ...)

Arguments

x
matrix of counts or a DGEList object
normalized.lib.sizes
logical, use normalized library sizes?
lib.size
library size, defaults to colSums(x).
log
logical, if TRUE then log2 values are returned.
prior.count
average count to be added to each observation to avoid taking log of zero. Used only if log=TRUE.
gene.length
vector of length nrow(x) giving gene lengths in bases, or the name of the column of x$genes containing the gene lengths.
...
other arguments that are not currently used.

Value

A numeric matrix of CPM or RPKM values, on the log2 scale if log=TRUE.

Details

CPM or RPKM values are useful descriptive measures for the expression level of a gene. By default, the normalized library sizes are used in the computation for DGEList objects but simple column sums for matrices.
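
The arithmetic behind these measures can be sketched in base R. This is a minimal illustration rather than edgeR's own code: the helper names cpm_sketch and rpkm_sketch are hypothetical, and the sketch assumes a plain count matrix whose library sizes are the column sums.

```r
# Hypothetical helpers for illustration only; they are not part of edgeR.
cpm_sketch <- function(counts, lib.size = colSums(counts)) {
  # Divide each column by its library size, then scale to "per million".
  t(t(counts) / lib.size) * 1e6
}

rpkm_sketch <- function(counts, gene.length, lib.size = colSums(counts)) {
  # CPM divided by gene length in kilobases (gene.length is given in bases).
  cpm_sketch(counts, lib.size) / (gene.length / 1000)
}

# Toy data: 4 genes (rows) by 2 samples (columns).
counts <- matrix(c(10, 20, 30, 40, 90, 80, 70, 60), nrow = 4, ncol = 2)
gene.length <- c(1000, 2000, 500, 4000)

cpm_vals <- cpm_sketch(counts)
rpkm_vals <- rpkm_sketch(counts, gene.length)
```

With column sums as library sizes, each column of cpm_vals adds up to one million, which is a quick sanity check on the scaling.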

If log-values are computed, then a small count, given by prior.count but scaled to be proportional to the library size, is added to x to avoid taking the log of zero.
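
One way a library-size-proportional prior count could be applied is sketched below. Two details are assumptions of this sketch, not statements from this page: the prior is scaled by each library's size relative to the mean library size, and each library size is enlarged by twice its scaled prior. The name log_cpm_sketch is hypothetical.

```r
# Hypothetical helper for illustration only; it is not part of edgeR.
log_cpm_sketch <- function(counts, lib.size = colSums(counts), prior.count = 0.25) {
  # Assumption: larger libraries receive a proportionally larger offset.
  prior.scaled <- prior.count * lib.size / mean(lib.size)
  # Assumption: each library size is increased by twice its scaled prior.
  adj.lib.size <- lib.size + 2 * prior.scaled
  # Add the offset to every count, convert to per-million, then take log2.
  log2(t((t(counts) + prior.scaled) / adj.lib.size) * 1e6)
}

# Toy data: the first gene has zero counts in both samples.
counts <- matrix(c(0, 5, 15, 0, 10, 30), nrow = 3, ncol = 2)
log_vals <- log_cpm_sketch(counts)
```

Because the offset in this sketch is proportional to library size, a zero count maps to the same finite log2 value in every sample, regardless of sequencing depth.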

The rpkm method for DGEList objects will try to find the gene lengths in a column of x$genes called Length or length. Failing that, it will look for any column name containing "length" in any capitalization.
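
The lookup order described above can be sketched as a small base R function. The name find_length_column is hypothetical, and this page does not say what happens when several columns match or none do; this sketch takes the first match and stops with an error if there is none.

```r
# Hypothetical helper for illustration only; it is not part of edgeR.
find_length_column <- function(genes) {
  nms <- names(genes)
  # First preference: a column named exactly "Length" or "length".
  exact <- which(nms %in% c("Length", "length"))
  if (length(exact) > 0) return(nms[exact[1]])
  # Fallback: any column name containing "length", ignoring case.
  fuzzy <- grep("length", nms, ignore.case = TRUE)
  if (length(fuzzy) > 0) return(nms[fuzzy[1]])
  stop("no gene length column found")
}

genes1 <- data.frame(Symbol = c("a", "b"), Length = c(1000, 2000))
genes2 <- data.frame(Symbol = c("a", "b"), GENE_LENGTH_BP = c(1000, 2000))

col1 <- find_length_column(genes1)
col2 <- find_length_column(genes2)
```

Passing gene.length explicitly avoids relying on column-name matching altogether.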

See Also

aveLogCPM

Examples

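library(edgeR)

# CPM from a plain count matrix (library sizes default to the column sums)
y <- matrix(rnbinom(20,size=1,mu=10),5,4)
cpm(y)

# The same counts in a DGEList with explicit library sizes
d <- DGEList(counts=y, lib.size=1001:1004)
cpm(d)
cpm(d,log=TRUE)

# Attach gene lengths (in bases) so that rpkm() can find them in d$genes
d$genes$Length <- c(1000,2000,500,1500,3000)
rpkm(d)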
