aggregateTagClusters(object, tpmThreshold = 5,
excludeSignalBelowThreshold = TRUE,
qLow = NULL, qUp = NULL, maxDist = 100)
CAGEset
object
Only tag clusters (TCs) with normalized signal >= tpmThreshold
will be used to construct consensus clusters.
TRUE
only tag clusters with normalized signal >= tpmThreshold
will contribute to the total CAGE signal of a consensus cluster, i.e. only the TCs that are used to construct the consensus cluster. When set to FALSE
all TCs that overlap a consensus cluster will contribute to its total signal (regardless of whether they pass the threshold), but only the TCs above the threshold will be used to define consensus cluster boundaries. Thus, in that case the TCs above the threshold are first used to construct consensus clusters and define their boundaries, and then the CAGE signal from all TCs that fall within those boundaries is used to calculate the total signal of each consensus cluster.
qLow = NULL
start position of the TC is used. See Details.
qUp = NULL
end position of the TC is used. qUp
has to be >= qLow
. See Details.
consensusClusters
, tagClustersInConsensusClusters
and consensusClustersTpmMatrix
of the provided CAGEset
object will be populated with the genomic coordinates of consensus clusters, information on the TCs they contain, and the total CAGE signal across all CAGE datasets, respectively.
Tag clusters (TCs) created with the clusterCTSS
function are constructed for every CAGE dataset within the CAGEset object separately, based on the CAGE signal in that sample. Thus, TCs from two CAGE datasets can differ in their number, genomic coordinates, position of the dominant TSS and overall signal. To be able to compare all samples at the level of clusters of TSSs, TCs from all CAGE datasets are aggregated into a single set of consensus clusters. First, TCs with signal >= tpmThreshold
from all CAGE datasets are selected, and their 5' and 3' boundaries are determined based on provided qLow
and qUp
parameters. If qLow = NULL
and qUp = NULL
the start and end coordinates, i.e. the full span of the TC is used, otherwise the positions of qLow
and qUp
quantiles are used as the 5' and 3' boundaries, respectively. Finally, the defined set of TCs from all CAGE datasets is reduced to a non-overlapping set of consensus clusters by merging overlapping TCs and TCs <= maxDist base-pairs apart. Consensus clusters represent a reference set of promoters that can be further used for expression profiling or for detecting "shifting" (differentially used) promoters between different CAGE samples.
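The steps described above can be illustrated with a small base R sketch. It is not the CAGEr implementation: the function aggregate_toy, its argument names, the toy data frame and its column names are all hypothetical. Two assumptions are made for illustration only: the distance between two TCs is taken as the number of bases between them, and the signal is summed over all samples rather than reported per sample.

```r
# Toy tag clusters (TCs) pooled from two samples on one chromosome and strand.
# All coordinates and TPM values are invented for illustration.
tcs <- data.frame(
  sample = c("s1", "s1", "s2", "s2", "s2"),
  start  = c(100, 250, 120, 170, 900),
  end    = c(150, 300, 180, 200, 950),
  q_low  = c(110, 260, 130, 175, 910),  # position of the qLow quantile
  q_up   = c(140, 290, 170, 195, 940),  # position of the qUp quantile
  tpm    = c(60, 80, 55, 3, 70)
)

# Hypothetical helper mirroring the described steps (not a CAGEr function).
aggregate_toy <- function(tcs, tpm_threshold = 5, exclude_below = TRUE,
                          use_quantiles = FALSE, max_dist = 100) {
  # Step 1: only TCs at or above the threshold define cluster boundaries.
  keep <- tcs[tcs$tpm >= tpm_threshold, ]
  stopifnot(nrow(keep) > 0)

  # Step 2: pick 5' and 3' boundaries (full span or quantile positions).
  lo <- if (use_quantiles) keep$q_low else keep$start
  hi <- if (use_quantiles) keep$q_up else keep$end
  ord <- order(lo, hi)
  lo <- lo[ord]
  hi <- hi[ord]

  # Step 3: merge overlapping TCs and TCs at most max_dist bases apart.
  # Assumption: the gap is the number of bases between two ranges.
  cc_start <- lo[1]
  cc_end <- hi[1]
  for (i in seq_along(lo)[-1]) {
    n <- length(cc_end)
    if (lo[i] - cc_end[n] - 1 <= max_dist) {
      cc_end[n] <- max(cc_end[n], hi[i])
    } else {
      cc_start <- c(cc_start, lo[i])
      cc_end <- c(cc_end, hi[i])
    }
  }

  # Step 4: sum the signal of contributing TCs that overlap each cluster.
  # Below-threshold TCs never move boundaries; they only add signal
  # when exclude_below is FALSE.
  pool <- if (exclude_below) keep else tcs
  p_lo <- if (use_quantiles) pool$q_low else pool$start
  p_hi <- if (use_quantiles) pool$q_up else pool$end
  tpm <- vapply(seq_along(cc_start), function(j) {
    sum(pool$tpm[p_lo <= cc_end[j] & p_hi >= cc_start[j]])
  }, numeric(1))

  data.frame(start = cc_start, end = cc_end, tpm = tpm)
}

strict <- aggregate_toy(tcs, exclude_below = TRUE)
loose  <- aggregate_toy(tcs, exclude_below = FALSE)
print(strict)
print(loose)
```

In this toy input both calls give the same two consensus clusters (100 to 300 and 900 to 950), because the TC with 3 TPM never takes part in defining boundaries. Only the total signal of the first cluster changes, from 195 to 198, once that below-threshold TC is allowed to contribute.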
clusterCTSS
load(system.file("data", "exampleCAGEset.RData", package="CAGEr"))
aggregateTagClusters(object = exampleCAGEset, tpmThreshold = 50,
excludeSignalBelowThreshold = FALSE, qLow = 0.1, qUp = 0.9, maxDist = 100)