normalizeTagCount(object, method = "powerLaw", fitInRange = c(10, 1000), alpha = 1.25, T = 10^6)CAGEset object
"simpleTpm" to convert tag counts to tags per million or "powerLaw" to normalize to a referent power-law distribution, or "none" to keep using the raw tag counts in downstream analyses.
method = "powerLaw", otherwise ignored. See Details.
-1 * alpha will be the slope of the referent power-law distribution in the log-log representation. Used only when method = "powerLaw", otherwise ignored. See Details.
T = 10^6 results in normalized values that correspond to tags per million in the referent distribution. Used only when method = "powerLaw", otherwise ignored. See Details.
normalizedTpmMatrix of the provided CAGEset object will be occupied by normalized CAGE signal values per CTSS across all experiments, or with the raw tag counts (in case method = "none").
y = -1 * alpha * x + beta, which is fully determined by the slope alpha and total number of tags T (which together with alpha determines the value of beta). Thus, by specifying parameters alpha and T a desired referent power-law distribution can be selected. However, real CAGE datasets deviate from the power-law in the areas of very low and very high number of tags, so it is advisable to discard these areas before fitting a power-law distribution. fitInRange parameter allows to specify a range of values (lower and upper limit of the number of CAGE tags) that will be used to fit a power-law. Plotting reverse cumulatives using plotReverseCumulatives function can help in choosing the best range of values. After fitting a power-law distribution to each CAGE dataset individually, all datasets are normalized to a referent distribution specified by alpha and T. When T = 10^6, normalized values are expressed as tags per million (tpm).
plotReverseCumulatives
load(system.file("data", "exampleCAGEset.RData", package="CAGEr"))
normalizeTagCount(exampleCAGEset, method = "powerLaw")
Run the code above in your browser using DataLab