normalizeTagCount(object, method = "powerLaw", fitInRange = c(10, 1000), alpha = 1.25, T = 10^6)
object: A CAGEset object.
method: Normalization method. Use "simpleTpm" to convert tag counts to tags per million, "powerLaw" to normalize to a referent power-law distribution, or "none" to keep using the raw tag counts in downstream analyses.
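fitInRange: Range of values (lower and upper limit of the number of CAGE tags) used to fit a power-law distribution. Used only when method = "powerLaw", otherwise ignored. See Details.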
alpha: -1 * alpha will be the slope of the referent power-law distribution in the log-log representation. Used only when method = "powerLaw", otherwise ignored. See Details.
T: Total number of tags in the referent power-law distribution. T = 10^6 results in normalized values that correspond to tags per million in the referent distribution. Used only when method = "powerLaw", otherwise ignored. See Details.
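As a minimal sketch of what the "simpleTpm" option means, the base R snippet below converts a vector of raw tag counts to tags per million. The counts are made up for illustration, and the code uses the standard tags-per-million definition rather than reproducing the package internals.

```r
# Hypothetical raw tag counts for four CTSSs in one sample (illustrative values)
counts <- c(5, 20, 75, 900)

# Tags per million: scale each count so that the sample total becomes 10^6
tpm <- counts / sum(counts) * 1e6
```

After this scaling the values of every sample sum to one million, which makes samples with different sequencing depths comparable.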
The normalizedTpmMatrix slot of the provided CAGEset object will be occupied by normalized CAGE signal values per CTSS across all experiments, or by the raw tag counts (when method = "none").
On a log-log scale, the referent power-law distribution is described by the linear function y = -1 * alpha * x + beta, which is fully determined by the slope parameter alpha and the total number of tags T (which together with alpha determines the value of beta). Thus, a desired referent power-law distribution can be selected by specifying the parameters alpha and T. However, real CAGE datasets deviate from the power law at very low and very high numbers of tags, so it is advisable to exclude these regions before fitting a power-law distribution. The fitInRange parameter specifies the range of values (lower and upper limit of the number of CAGE tags) used to fit the power law. Plotting reverse cumulatives with the plotReverseCumulatives function can help in choosing the best range. After a power-law distribution has been fitted to each CAGE dataset individually, all datasets are normalized to the referent distribution specified by alpha and T. When T = 10^6, normalized values are expressed as tags per million (tpm).
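The procedure described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal code: the tag counts are simulated, the names totalTags, betaRef, zetaApprox and normalizeToReferent are hypothetical, and beta is derived by assuming that the referent reverse cumulative, summed over all tag counts, equals the total number of tags.

```r
# Simulated tag counts per CTSS for one hypothetical sample (not real CAGE data).
# floor(U^(-1/1.1)) yields integer counts whose reverse cumulative is roughly
# a power law with exponent 1.1.
set.seed(1)
counts <- floor(runif(200000)^(-1 / 1.1))

# Reverse cumulative: for each distinct tag count x, the number of CTSSs
# supported by at least x tags.
tab <- table(counts)
x <- as.numeric(names(tab))
y <- rev(cumsum(rev(as.numeric(tab))))

# Fit a line on the log-log scale using only tag counts inside fitInRange.
fitInRange <- c(10, 1000)
keep <- x >= fitInRange[1] & x <= fitInRange[2]
lx <- log10(x[keep])
ly <- log10(y[keep])
fit <- lm(ly ~ lx)
b <- unname(coef(fit)[1])  # fitted intercept
a <- unname(coef(fit)[2])  # fitted slope (negative)

# Referent line y = -1 * alpha * x + beta on the log-log scale.
# Assumption: summing the referent reverse cumulative 10^beta * k^(-alpha)
# over k = 1, 2, ... gives the total number of tags, so
# 10^beta = totalTags / zeta(alpha).
alpha <- 1.25
totalTags <- 10^6  # plays the role of T; renamed because T also means TRUE in R

zetaApprox <- function(s, n = 1e5) {
  # Truncated sum plus an integral estimate of the tail (valid for s > 1)
  k <- seq_len(n)
  sum(k^(-s)) + n^(1 - s) / (s - 1) - 0.5 * n^(-s)
}
betaRef <- log10(totalTags / zetaApprox(alpha))

# Map each raw count to the value that has the same reverse-cumulative
# height on the referent line as the raw count has on the fitted line.
normalizeToReferent <- function(v) {
  10^((betaRef - b) / alpha) * v^(-a / alpha)
}
normalized <- normalizeToReferent(counts)
```

Applying the same mapping to every sample, each with its own fitted a and b, places all samples on the common referent distribution. Changing fitInRange changes the fitted coefficients, which is why inspecting the reverse cumulative plots before choosing the range is worthwhile.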
plotReverseCumulatives
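# Load the example CAGEset object shipped with the CAGEr package
load(system.file("data", "exampleCAGEset.RData", package="CAGEr"))
# Normalize to the referent power-law distribution using the default fitInRange, alpha and T
normalizeTagCount(exampleCAGEset, method = "powerLaw")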