Learn R Programming

csaw (version 1.4.0)

checkBimodality: Check bimodality of regions

Description

Compute the maximum bimodality score across all base pairs in each region.

Usage

checkBimodality(bam.files, regions, width=100, param=readParam(), prior.count=2, invert=FALSE)

Arguments

bam.files
a character vector containing paths to sorted and indexed BAM files
regions
a GRanges object specifying the regions over which bimodality is to be calculated
width
an integer vector indicating the span with which to compute bimodality
param
a readParam object containing read extraction parameters
prior.count
a numeric scalar specifying the prior count to compute bimodality scores
invert
a logical scalar indicating whether bimodality score should be inverted

Value

A numeric vector containing the maximum bimodality score across all bases in each region.

Details

Consider a base position x. This function counts the number of forward- and reverse-strand reads within the interval [x-width+1, x]. It then calculates the forward:reverse ratio after adding prior.count to both counts. This is repeated for the interval [x, x+width-1], and the reverse:forward ratio is then computed. The smaller of these two ratios is used as the bimodality score.

Sites with high bimodality scores will be enriched for forward- and reverse-strand enrichment on the left and right of the site, respectively. Given a genomic region, this function will treat each base position as a site. The largest bimodality score across all positions will be reported for each region. The idea is to assist with the identification of transcription factor binding sites, which exhibit strong strand bimodality. The function will be less useful for broad targets like histone marks.

If multiple bam.files are specified, they are effectively pooled so that counting uses all reads in all files. The value of width should be set to the average fragment length, as this is allows optimal counting of reads. Specifically, all reads in the subpeaks on either strand will be counted when x is set to the center of the peak. A separate value can be specified for each library, to account for differences in fragmentation.

If invert is set, the bimodality score will be flipped around, i.e., it will be maximized when reverse-strand coverage dominates on the left, and forward-strand coverage dominates on the right. This is designed for use in CAGE analyses where this inverted bimodality is symptomatic of enhancer RNAs.

Examples

Run this code
bamFiles <- system.file("exdata", c("rep1.bam", "rep2.bam"), package="csaw")
incoming <- GRanges(c('chrA', 'chrA', 'chrB', 'chrC'), 
    IRanges(c(1, 500, 100, 1000), c(100, 580, 500, 1500)))

checkBimodality(bamFiles, incoming)
checkBimodality(bamFiles, incoming, width=c(100, 200))
checkBimodality(bamFiles, incoming, param=readParam(minq=20, dedup=TRUE))
checkBimodality(bamFiles, incoming, prior.count=5)

# Works on PE data; scores are computed from paired/rescued reads.
bamFile <- system.file("exdata", "pet.bam", package="csaw")
checkBimodality(bamFile, incoming[1:3], param=readParam(pe="both"))
checkBimodality(bamFile, incoming[1:3], param=readParam(pe="both", 
    rescue.ext=50, max.frag=100))

Run the code above in your browser using DataLab