Learn R Programming

csaw (version 1.2.1)

dumpPE: Dump paired-end data to file

Description

Extract proper pairs from a BAM file, and dump fragment intervals into another BAM file.

Usage

dumpPE(bam.file, prefix, param=readParam(pe="both"), overwrite=FALSE)

Arguments

bam.file
a character string containing the path to a paired-end BAM file
prefix
a character string containing the prefix to an output BAM file
param
a readParam object
overwrite
a logical scalar indicating whether to overwrite the existing file starting with prefix

Value

A sorted and indexed BAM file is produced at the specified location. A character string containing the full name of the output BAM file is silently returned.

Details

This function extracts proper pairs from bam.file according to the settings in param. It then generates another output BAM file that stores fragment information for each proper pair. Each alignment entry represents the forward-stranded read of a valid fragment. Fragment data can be extracted as the read position and insert size.

The idea is to generate a pre-filtered BAM file by specifying the appropriate settings in param. The output BAM file can then be efficiently analyzed with, e.g., windowCounts with fast.pe=TRUE. This avoids the need to load and match read names in every function. Note that all alignment-specific information is lost, e.g., MAPQ scores, duplicate information, CIGAR strings.

See Also

readParam, windowCounts

Examples

Run this code
# Loading PE data.
bamFile <- system.file("exdata", "pet.bam", package="csaw")

xparam <- readParam(pe="both")
out <- windowCounts(bamFile, param=xparam, filter=1)

outBam <- dumpPE(bamFile, "whee", param=xparam)
out2 <- windowCounts(outBam, param=reform(xparam, fast.pe=TRUE), filter=1)

stopifnot(identical(assay(out), assay(out2)))
stopifnot(identical(out$totals, out2$totals))
unlink("whee.bam")

# Comparing it to a more complex scenario.
xparam <- readParam(pe="both", rescue.ext=200)
out <- windowCounts(bamFile, param=xparam, filter=1)

outBam <- dumpPE(bamFile, "whee", param=xparam)
out2 <- windowCounts(outBam, param=reform(xparam, fast.pe=TRUE), filter=1)

stopifnot(identical(assay(out), assay(out2)))
stopifnot(identical(out$totals, out2$totals))

# Looking at extracted reads.
last.reg <- rowRanges(out)[6]
reg1 <- extractReads(last.reg, bamFile, param=xparam)
reg2 <- extractReads(last.reg, outBam, param=reform(xparam, fast.pe=TRUE))

stopifnot(identical(sort(reg1), sort(reg2)))
unlink("whee.bam")

Run the code above in your browser using DataLab