boundingIndices: Find indices of features bounding a set of chromosome ranges/genes
Description
This function is similar to findOverlaps but it guarantees at least two features will be
covered. This is useful in the case of finding features corresponding to a set of genes.
Some genes will fall entirely between two features and thus would not return any ranges
with findOverlaps. Specifically, this function will find the indices of the features
(first and last) bounding the ends of a range/gene (start and stop) such that
first <= start and stop <= last.
Arguments
starts
integer vector of first base position of each query range
stops
integer vector of last base position of each query range
positions
Base positions in which to search
all.indices
logical; if TRUE, return a list containing the full sequence of indices for each query
Value
integer matrix of 2 columns giving the start and stop index of each range in the data, or a list containing the full sequence of indices for each query (see the all.indices argument)
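The two return shapes can be illustrated with a minimal base R sketch. This is not the package's implementation: the function name bounding_sketch and its clamping of out-of-range queries to the first or last feature are assumptions made for illustration, and it makes no attempt to reproduce the package's handling of edge cases.

```r
# Hypothetical re-implementation for illustration only; not the genoset source.
bounding_sketch <- function(starts, stops, positions, all.indices = FALSE) {
  n <- length(positions)
  # Largest index i with positions[i] <= start (0 if start precedes all features).
  first <- findInterval(starts, positions)
  first[first == 0L] <- 1L            # assumed: clamp to the first feature
  # Smallest index i with positions[i] >= stop (n + 1 if stop follows all features).
  last <- findInterval(stops, positions, left.open = TRUE) + 1L
  last[last > n] <- n                 # assumed: clamp to the last feature
  if (all.indices) {
    # One full index sequence per query.
    return(Map(seq.int, first, last))
  }
  cbind(first = first, last = last)
}

positions <- c(10L, 20L, 30L, 40L, 50L)
starts <- c(22L, 5L, 30L)
stops  <- c(28L, 35L, 60L)

res <- bounding_sketch(starts, stops, positions)
res
#      first last
# [1,]     2    3   (gene lies entirely between features 2 and 3)
# [2,]     1    4   (start precedes the first feature)
# [3,]     3    5   (stop follows the last feature)

idx <- bounding_sketch(starts, stops, positions, all.indices = TRUE)
idx[[2]]  # 1 2 3 4
```

The first query is the case described above: it overlaps no feature, yet the two features on either side of it are still returned.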
Details
This function uses some tricks from findInterval, which for k queries and n features
is O(k * log(n)) generally and ~O(k) for sorted queries. It will therefore be dramatically
faster for sets of query genes that are sorted by start position within each chromosome.
The index of the stop position for each gene is found using the left bound from the start
of the gene, reducing the search space for the stop position somewhat. boundingIndices does not
check for NAs or unsorted data in the subject positions. These assumptions are safe for
position info coming from a GenoSet or GRanges.
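For positions that do not come from a GenoSet or GRanges, the unchecked assumptions can be verified by hand before calling the function. The sketch below also shows, in base R, the idea of searching for the stop index only to the right of the start's left bound. Both helpers (check_positions and stop_index_from) are hypothetical names introduced here, not part of the package.

```r
# Hypothetical guard for the assumptions boundingIndices leaves unchecked.
check_positions <- function(positions) {
  if (anyNA(positions)) stop("positions contains NA")
  if (is.unsorted(positions)) stop("positions must be sorted in increasing order")
  invisible(TRUE)
}

# Hypothetical helper: find the index of the feature bounding stop_pos on the
# right, searching only positions[left:n] rather than the whole vector.
stop_index_from <- function(stop_pos, positions, left) {
  n <- length(positions)
  offset <- left - 1L
  # Count tail positions strictly below stop_pos, step one feature to the
  # right, then shift back to full-vector indexing.
  i <- findInterval(stop_pos, positions[left:n], left.open = TRUE) + 1L + offset
  min(i, n)  # assumed: clamp to the last feature
}

positions <- c(10L, 20L, 30L, 40L, 50L)
check_positions(positions)

a <- stop_index_from(28L, positions, left = 2L)  # 3
b <- stop_index_from(60L, positions, left = 3L)  # 5
```

Subsetting positions[left:n] copies the tail, so this sketch only illustrates the narrowed search; it is not a claim about how the package avoids that cost.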