Learn R Programming

csaw (version 1.2.1)

extractReads: Extract reads from a BAM file

Description

Extract reads from a BAM file with the specified parameter settings.

Usage

extractReads(cur.region, bam.file, ext=NA, param=readParam())

Arguments

cur.region
a GRanges object of length 1 describing the region of interest
bam.file
a character string containing the path to a sorted and indexed BAM file
ext
an integer scalar or vector, containing the fragment lengths for directional read extension
param
a readParam object specifying how reads should be extracted

Value

A GRanges object is returned. If pe="both" in param, intervals are unstranded and correspond to fragments. Otherwise, strand-specific intervals that represent reads are returned.

Details

This function extracts the reads from a BAM file overlapping a given genomic interval. The interpretation of the values in param is the same as that throughout the package. The aim is to supply the raw data for visualization, in a manner that maintains consistency with the rest of the analysis.

Note that this does not account for any read extension that might have been performed during read counting. In such cases, users are advised to expand cur.region by the extension length on each side. Counted reads can then be extracted by identifying their extended counterparts that overlap with the original cur.region.

Any strandedness of cur.region is ignored. If strand-specific extraction is desired, this can be done by setting param$forward via reform. Alternatively, the returned GRanges can be filtered to retain only the desired strand.

If ext is not NA, directional read extension will be performed for single-end data. See windowCounts for more details.

See Also

readParam, windowCounts

Examples

Run this code
bamFile <- system.file("exdata", "rep1.bam", package="csaw")
extractReads(GRanges("chrA", IRanges(100, 500)), bamFile)
extractReads(GRanges("chrA", IRanges(100, 500)), bamFile, param=readParam(dedup=TRUE))

bamFile <- system.file("exdata", "pet.bam", package="csaw")
extractReads(GRanges("chrB", IRanges(100, 500)), bamFile)
extractReads(GRanges("chrB", IRanges(100, 500)), bamFile, param=readParam(pe="both"))
extractReads(GRanges("chrB", IRanges(100, 500)), bamFile, param=readParam(pe="first"))

# Dealing with the extension length.
bamFile <- system.file("exdata", "rep1.bam", package="csaw")
ext <- 100
my.reg <- GRanges("chrA", IRanges(10, 200))
my.reg2 <- resize(my.reg, fix="center", width=width(my.reg)+2*ext)
collected <- extractReads(my.reg2, bamFile)

expanded <- resize(collected, width=ext) 
strand(expanded) <- "*"
relevant <- overlapsAny(expanded, my.reg)
collected[relevant,]

Run the code above in your browser using DataLab