Learn R Programming

csaw (version 1.4.0)

extractReads: Extract reads from a BAM file

Description

Extract reads from a BAM file with the specified parameter settings.

Usage

extractReads(bam.file, region, ext=NA, param=readParam(), as.reads=FALSE)

Arguments

bam.file
a character string containing the path to a sorted and indexed BAM file
region
a GRanges object of length 1 describing the region of interest
ext
an integer scalar specifying the fragment length for directional read extension
param
a readParam object specifying how reads should be extracted
as.reads
a logical scalar indicating whether reads should be returned instead of fragments for paired-end data

Value

A GRanges object is returned. If pe="both" in param, intervals are unstranded and correspond to fragments. Otherwise, strand-specific intervals that represent reads are returned.

Details

This function extracts the reads from a BAM file overlapping a given genomic interval. The interpretation of the values in param is the same as that throughout the package. The aim is to supply the raw data for visualization, in a manner that maintains consistency with the rest of the analysis.

Note that this does not account for any read extension that might have been performed during read counting. In such cases, users are advised to expand region by the extension length on each side. Counted reads can then be extracted by identifying their extended counterparts that overlap with the original region.

Any strandedness of region is ignored. If strand-specific extraction is desired, this can be done by setting param$forward via reform. Alternatively, the returned GRanges can be filtered to retain only the desired strand.

If ext is not NA, directional read extension will be performed for single-end data. All extended reads overlapping region will then be extracted, and the intervals of the extended will be reported.. See windowCounts for more details.

By default, unstranded fragments are returned for paired-end data. If as.reads=TRUE, the constituent reads are returned, with pairing indicated by pair in the metadata. Rescued reads are marked as those with pair of zero; see readParam for details on rescuing.

See Also

readParam, windowCounts

Examples

Run this code
bamFile <- system.file("exdata", "rep1.bam", package="csaw")
extractReads(bamFile, GRanges("chrA", IRanges(100, 500)))
extractReads(bamFile, GRanges("chrA", IRanges(100, 500)),
    param=readParam(dedup=TRUE))
extractReads(bamFile, GRanges("chrB", IRanges(100, 500)))

bamFile <- system.file("exdata", "pet.bam", package="csaw")
extractReads(bamFile, GRanges("chrB", IRanges(100, 500)), 
    param=readParam(pe="both"))
extractReads(bamFile, GRanges("chrB", IRanges(100, 500)), 
    param=readParam(pe="both", rescue.ext=100))
extractReads(bamFile, GRanges("chrB", IRanges(100, 500)), 
    param=readParam(pe="first"))

# Extracting as reads.
extractReads(bamFile, GRanges("chrB", IRanges(100, 500)), 
    param=readParam(pe="both"), as.reads=TRUE)
extractReads(bamFile, GRanges("chrB", IRanges(100, 500)), 
    param=readParam(pe="both", rescue.ext=100), as.reads=TRUE)

# Dealing with the extension length.
bamFile <- system.file("exdata", "rep1.bam", package="csaw")
my.reg <- GRanges("chrA", IRanges(10, 200))
extractReads(bamFile, my.reg)
extractReads(bamFile, my.reg, ext=100)

Run the code above in your browser using DataLab