Ringo (version 1.36.0)

regionOverlap: Function to compute overlap of genomic regions

Description

Given two data frames of genomic regions, this function computes the base-pair overlap, if any, between every pair of regions, one region taken from each data frame.
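To make the idea of a pairwise base-pair overlap concrete, here is a minimal base-R sketch. It is not Ringo's implementation: the helper name pairwise_overlap is hypothetical, the column names are hard-coded to the defaults, and it assumes inclusive start and end coordinates (so two regions sharing a single position overlap by 1), which this page does not confirm for regionOverlap itself. It also builds dense matrices, so it only suits small inputs.

```r
# Hypothetical helper for illustration only; not part of Ringo.
# Assumes columns named chr, start, end and inclusive coordinates.
pairwise_overlap <- function(xdf, ydf) {
  # TRUE where region i of xdf and region j of ydf share a chromosome
  same_chr <- outer(as.character(xdf$chr), as.character(ydf$chr), "==")
  # Left and right border of the candidate intersection of each pair
  lo <- outer(xdf$start, ydf$start, pmax)
  hi <- outer(xdf$end, ydf$end, pmin)
  # Negative lengths mean "no overlap"; pairs on different chromosomes get 0
  pmax(hi - lo + 1, 0) * same_chr
}

x <- data.frame(chr = c("chr1", "chr1", "chr8"),
                start = c(100, 510, 60), end = c(200, 520, 80))
y <- data.frame(chr = c("chr1", "chr8"),
                start = c(500, 80), end = c(700, 120))

m <- pairwise_overlap(x, y)
m
```

Under the inclusive-coordinate assumption, only m[2, 1] and m[3, 2] are nonzero here; every other pair either lies on different chromosomes or does not intersect.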

Usage

regionOverlap(xdf, ydf, chrColumn = "chr", startColumn = "start", endColumn = "end", mem.limit=1e8)

Arguments

xdf
data.frame that holds the first set of genomic regions
ydf
data.frame that holds the second set of genomic regions
chrColumn
character; what is the name of the column that holds the chromosome name of the regions in xdf and ydf
startColumn
character; what is the name of the column that holds the start position of the regions in xdf and ydf
endColumn
character; what is the name of the column that holds the end position of the regions in xdf and ydf
mem.limit
integer value; what is the maximal allowed size of matrices during the computation

Value

Conceptually, a matrix with nrow(xdf) rows and nrow(ydf) columns, in which entry X[i,j] gives the length of the overlap between region i of the first set (xdf) and region j of the second set (ydf). Because this matrix is usually very sparse, it is stored in the dgCMatrix representation from the Matrix package.
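For readers unfamiliar with the dgCMatrix class, the following sketch shows how such a sparse matrix behaves, using only functions from the Matrix package. The overlap values are made up for illustration and the object is built by hand with sparseMatrix rather than by regionOverlap.

```r
library(Matrix)

# Made-up overlap lengths given as (row, column, value) triplets:
# region 2 of x overlaps region 1 of y by 11 bp, region 3 overlaps region 2 by 1 bp
ov <- sparseMatrix(i = c(2, 3), j = c(1, 2), x = c(11, 1), dims = c(3, 2))

is(ov, "dgCMatrix")   # numeric sparse matrix in compressed column format
as.matrix(ov)         # dense view, practical only for small matrices
nz <- summary(ov)     # one row per nonzero entry: columns i, j and x
nz
```

Only the nonzero entries are stored, which is why this representation scales to large region sets where almost all pairs have no overlap.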

See Also

dgCMatrix-class

Examples

  library(Ringo)

  ## toy example:
  regionsH3ac <- data.frame(chr=c("chr1","chr7","chr8","chr1","chrX","chr8"),
                            start=c(100,100,100,510,100,60),
                            end=c(200,200,200,520,200,80))
  regionsH4ac <- data.frame(chr=c("chr1","chr2","chr7","chr8","chr9"),
                            start=c(500,100,50,80,100),
                            end=c(700,200,250,120,200))

  ## compare the regions first by eye
  ##  which ones do overlap and by what amount?
  regionsH3ac
  regionsH4ac

  ## compare it to the result:
  ovl <- regionOverlap(regionsH3ac, regionsH4ac)
  as.matrix(ovl)   # dense view of the overlap matrix
  nonzero(ovl)     # row and column indices of the nonzero entries