diffHic (version 1.4.2)

getPairData: Get read pair data

Description

Extract diagnostics for each read pair from an index file.

Usage

getPairData(file, param)

Arguments

file
character string, specifying the path to the index file produced by preparePairs
param
a pairParam object containing read extraction parameters

Value

A data frame is returned with one row per read pair, containing the integer fields length, orientation and insert.

Details

This is a convenience function to extract read pair diagnostics from an index file, generated from a Hi-C library with preparePairs. The aim is to examine the distribution of each returned value to determine the appropriate cutoffs for prunePairs.
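As a rough illustration of that inspection step, the sketch below summarises a small mock data frame with the same three columns. The values are invented for illustration and do not come from a real library; in practice the data frame would be the output of getPairData.

```r
# Mock diagnostics with the columns described under Value (values are invented).
diags <- data.frame(
  length      = c(180L, 220L, 310L, 450L, 2600L, 275L),
  orientation = c(1L, 2L, 0L, 1L, 3L, 2L),
  insert      = c(950L, 120000L, NA, 400L, 88000L, NA)
)

# Upper quantiles of the fragment length help to spot an unusually long tail.
len.q <- quantile(diags$length, probs = c(0.5, 0.9, 0.99))

# Insert sizes are NA for interchromosomal pairs, so summarise the rest only.
intra <- !is.na(diags$insert)
n.inter <- sum(!intra)

# Median insert size for each orientation value among intrachromosomal pairs.
ins.by.ori <- tapply(diags$insert[intra], diags$orientation[intra], median)

len.q
ins.by.ori
```

On real data, histograms of the same columns (for example with hist) are likely to be more informative than a handful of quantiles.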

The length refers to the length of the DNA fragment used in sequencing. It is computed for each read pair by adding the distance of each read to the closest restriction site in the direction of the read.
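A minimal sketch of this calculation is shown below. It is not the diffHic implementation: the site positions, the read positions and the helper dist_to_site are hypothetical, and any off-by-one conventions used by the package are ignored.

```r
# Hypothetical restriction site positions on one chromosome (assumed values).
sites <- c(100L, 500L, 900L, 1400L)

# Distance from the 5' end of a read to the closest site in the read's direction.
# A forward-strand read points towards higher coordinates, a reverse-strand
# read towards lower coordinates.
dist_to_site <- function(pos, reverse, sites) {
  if (reverse) {
    pos - max(sites[sites <= pos])
  } else {
    min(sites[sites >= pos]) - pos
  }
}

# One read pair: a forward read at 420 and a reverse read at 1010.
d1 <- dist_to_site(420L, FALSE, sites)   # 500 - 420
d2 <- dist_to_site(1010L, TRUE, sites)   # 1010 - 900
frag.length <- d1 + d2
frag.length
```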

The insert simply refers to the insert size for each read pair. This is defined as the distance between the extremes of each read on the same chromosome. Values for interchromosomal pairs are set to NA.
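The definition can be sketched on two hypothetical read pairs, one intrachromosomal and one interchromosomal. The column names and coordinates are assumptions made for illustration, and the sketch assumes inclusive coordinates (hence the + 1), which may differ from the convention used by the package.

```r
# Hypothetical alignments for two read pairs (all values are invented).
reads <- data.frame(
  chr1 = c("chrA", "chrA"), start1 = c(100L, 250L), end1 = c(150L, 300L),
  chr2 = c("chrA", "chrB"), start2 = c(400L, 120L), end2 = c(450L, 170L),
  stringsAsFactors = FALSE
)

# Span between the outermost ends of the two reads, assuming inclusive coordinates.
span <- pmax(reads$end1, reads$end2) - pmin(reads$start1, reads$start2) + 1L

# The insert size is only defined when both reads lie on the same chromosome.
insert <- ifelse(reads$chr1 == reads$chr2, span, NA_integer_)
insert
```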

The orientation is a bit field. A set 0x1 bit means that the read mapped into the first anchor fragment is on the reverse strand, and a set 0x2 bit means the same for the read mapped into the second anchor fragment. For intrachromosomal reads, an orientation value of 1 represents inward-facing reads whereas a value of 2 represents outward-facing reads.
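The two strand flags can be recovered from the orientation value with base R's bitwAnd. The sketch below decodes the four possible values; the variable names are chosen for illustration only.

```r
# All possible orientation values.
orientation <- 0:3

# Bit 0x1: read in the first anchor fragment is on the reverse strand.
first.reverse <- bitwAnd(orientation, 1L) != 0L

# Bit 0x2: read in the second anchor fragment is on the reverse strand.
second.reverse <- bitwAnd(orientation, 2L) != 0L

data.frame(orientation, first.reverse, second.reverse)
```

Values of 0 and 3 therefore correspond to pairs in which both reads lie on the same strand.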

Note that a pairParam object is used here mainly for consistency with other functions. Only the restriction fragment coordinates in param$fragments are required. No removal of read pairs is performed here, so the values of param$restrict and param$discard are ignored.

See Also

preparePairs, prunePairs

Examples

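library(diffHic)

# Example BAM file and restriction fragment coordinates shipped with diffHic
hic.file <- system.file("exdata", "hic_sort.bam", package="diffHic")
cuts <- readRDS(system.file("exdata", "cuts.rds", package="diffHic"))
param <- pairParam(cuts)

# Build the index file, then extract the diagnostics for each read pair
tmpf <- "gunk.h5"
invisible(preparePairs(hic.file, param, tmpf))
getPairData(tmpf, param)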

