breakTranscriptsByPeaks: breakTranscriptsByPeaks

Description

The function divides closely spaced transcripts into individually transcribed units using the detected active transcription start sites.

Usage

breakTranscriptsByPeaks(tdsObj, cdsObj, estimate.params = TRUE)
## S4 method for signature 'TranscriptionDataSet,ChipDataSet'
breakTranscriptsByPeaks(tdsObj,
  cdsObj, estimate.params = TRUE)

Arguments

tdsObj

A TranscriptionDataSet object.

cdsObj

A ChipDataSet object.

estimate.params

Logical. Whether to estimate expression level and coverage density of the newly detected transcripts. Default: TRUE.

Value

The transcripts slot of the provided TranscriptionDataSet object is updated with a GRanges object containing the transcripts and, if estimated, their corresponding expression levels.

Details

One of the challenges in primary transcript detection is the simultaneous transcription of closely spaced genes, which need to be divided into individually transcribed units. To overcome this challenge, transcriptR combines RNA-seq data with ChIP-seq data for histone modifications that mark active transcription start sites (TSSs), such as H3K4me3 or H3K9/14Ac. The advantage of this approach over the use of, for example, gene annotations is that it is data-driven and can therefore also handle novel and case-specific events. Furthermore, the integration of ChIP-seq and RNA-seq data allows the identification of all known and novel active TSSs within a given sample. Transcription initiation within a peak region is investigated by comparing RNA-seq read densities upstream and downstream of empirically determined TSSs. Closely spaced transcripts are then divided into individually transcribed units using the detected active TSSs.
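
The two ideas in the paragraph above (comparing read density on either side of a candidate TSS, then cutting a transcript at the active TSSs) can be sketched on toy data in base R. This is an illustrative sketch only and not transcriptR's implementation: the helpers density_ratio() and break_at_tss(), the window size and the coverage values are all hypothetical, and the example assumes a single plus-strand region with 1-based coordinates.

```r
# Toy illustration only; this is NOT how transcriptR is implemented.
# Made-up per-base RNA-seq coverage for positions 1..60 on the plus strand.
depth <- c(rep(8, 20), rep(2, 10), rep(9, 30))

# Hypothetical helper: mean coverage downstream of a candidate TSS divided by
# mean coverage upstream of it, each measured in a window of `win` bases.
density_ratio <- function(depth, tss, win) {
  stopifnot(tss > 1, tss <= length(depth), win >= 1)
  up   <- depth[max(1, tss - win):(tss - 1)]
  down <- depth[tss:min(length(depth), tss + win - 1)]
  mean(down) / mean(up)
}

# Hypothetical helper: split the interval [start, end] into consecutive units
# at every TSS position that lies strictly inside the interval.
break_at_tss <- function(start, end, tss) {
  cuts <- sort(unique(tss[tss > start & tss <= end]))
  data.frame(start = c(start, cuts), end = c(cuts - 1, end))
}

# A clear rise in coverage downstream of position 31 suggests initiation there.
ratio <- density_ratio(depth, tss = 31, win = 10)

# Divide the single detected region into two transcribed units at that TSS.
units <- break_at_tss(start = 1, end = 60, tss = 31)
print(ratio)
print(units)
```

In this toy case the downstream window has a higher mean coverage than the upstream window, so position 31 is treated as an active TSS and the region 1..60 is split into the units 1..30 and 31..60. In the package itself, the equivalent steps are performed on the TranscriptionDataSet and ChipDataSet objects by predictStrand() and breakTranscriptsByPeaks().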

Examples

### Load TranscriptionDataSet object
data(tds)

### Load ChipDataSet object
data(cds)

### Load reference annotations (knownGene from UCSC)
data(annot)

### Detect transcripts
detectTranscripts(object = tds, coverage.cutoff = 5, gap.dist = 4000,
                  estimate.params = TRUE, combine.by.annot = FALSE, annot = annot)

### Classify peaks as gene-associated or background
predictTssOverlap(object = cds, feature = "pileup", p = 0.75)

### Predict peak 'strand'
predictStrand(cdsObj = cds, tdsObj = tds, coverage.cutoff = 5,
              quant.cutoff = 0.1, win.size = 2500)

### If `estimate.params = TRUE`, FPKM and coverage density will be re-calculated
breakTranscriptsByPeaks(tdsObj = tds, cdsObj = cds, estimate.params = TRUE)

### View detected transcripts
getTranscripts(tds)