consolidateSizes: Consolidate window sizes

Description

Consolidate DB results from multiple window sizes.

Usage

consolidateSizes(data.list, result.list, equiweight=TRUE, merge.args=list(), combine.args=list(), region=NULL, overlap.args=list())

Arguments

data.list

a list of RangedSummarizedExperiment and/or GRanges objects

result.list

a list of data frames containing the DB test results for each entry of data.list

equiweight

a logical scalar indicating whether equal weighting from each window size should be enforced

merge.args

a named list of parameters to pass to mergeWindows

combine.args

a named list of parameters to pass to combineTests

region

a GRanges object specifying regions of interest for overlapping with windows

overlap.args

a named list of parameters to pass to findOverlaps

Value

id: a list of integer vectors, where each vector corresponds to an object in data.list; the entries of the vector specify the cluster to which each row of that object is assigned
region: a GRanges object containing the genomic coordinates of the clusters of merged windows (or other regions, if region is specified)
table: a data frame containing the combined DB results for each region

Details

This function consolidates DB results from multiple window sizes, to provide comprehensive detection of DB at a range of spatial resolutions. RangedSummarizedExperiment objects can be generated by running windowCounts at a range of window sizes. Windows of all sizes are clustered together through mergeWindows, and the p-values from all windows in each cluster are combined using combineTests. If merge.args$tol is not specified, it is automatically set to 100 bp and a warning is raised.

Some effort is required to equalize the contribution of each window size to the combined p-value of each cluster. This is done by setting equiweight=TRUE, where the weight of each window is inversely proportional to the number of windows of that size. Otherwise, the combined p-value would be determined by numerous small windows in each cluster.

If region is specified, each entry of region is defined as a cluster. Windows in each cluster are identified using findOverlaps, and consolidation is performed across multiple window sizes like before. Note that the returned id will be a list of Hits objects rather than integer vectors, as one window (subject) may overlap more than one region (query).

Examples

Run this code

bamFiles <- system.file("exdata", c("rep1.bam", "rep2.bam"), package="csaw")
low <- windowCounts(bamFiles, width=1, filter=1)
med <- windowCounts(bamFiles, width=100, filter=1)
high <- windowCounts(bamFiles, width=500, filter=1)

# Making up some DB results.
dbl <- data.frame(logFC=rnorm(nrow(low)), PValue=runif(nrow(low)), logCPM=0)
dbm <- data.frame(logFC=rnorm(nrow(med)), PValue=runif(nrow(med)), logCPM=0)
dbh <- data.frame(logFC=rnorm(nrow(high)), PValue=runif(nrow(high)), logCPM=0)

# Consolidating.
cons <- consolidateSizes(list(low, med, high), list(dbl, dbm, dbh),
	merge.args=list(tol=100, max.width=300))
cons$region
cons$id
cons$table

# Without weights.
cons <- consolidateSizes(list(low, med, high), list(dbl, dbm, dbh),
	merge.args=list(tol=100, max.width=300), equiweight=FALSE)
cons$table

# Trying with a custom region.
of.interest <- GRanges(c('chrA', 'chrA', 'chrB', 'chrC'), 
    IRanges(c(1, 500, 100, 1000), c(200, 1000, 700, 1500)))
cons <- consolidateSizes(list(low, med, high), list(dbl, dbm, dbh),
    region=of.interest)
cons$table
cons$id

# Trying with limited numbers of overlaps; empty regions are ignored.
cons <- consolidateSizes(list(low[1,], med[1,], high[1,]), 
    list(dbl[1,], dbm[1,], dbh[1,]), region=of.interest)
cons$region
cons$table

Run the code above in your browser using DataLab