The Ranges virtual class is a general container for storing a set of integer ranges.
A Ranges object is a vector-like object where each element describes a "range of integer values".
A "range of integer values" is a finite set of consecutive integer values. Each range can be fully described by exactly 2 integer values, chosen arbitrarily among the following 3: its "start", i.e. its smallest (or first, or leftmost) value; its "end", i.e. its greatest (or last, or rightmost) value; and its "width", i.e. the number of integer values in the range. For example, the set of integer values that are greater than or equal to -20 and less than or equal to 400 is the range that starts at -20 and has a width of 421. In other words, a range is a closed, one-dimensional interval on the domain of integers, with integer end points.
The starting point (or "start") of a range can be any integer (see
start below) but its "width" must be a non-negative integer (see
width below). The ending point (or "end") of a range is
equal to its "start" plus its "width" minus one (see end below).
An "empty" range is a range that contains no value i.e. a range that
has a null width. Depending on the context, it can be interpreted
either as just the empty set of integers or, more precisely,
as the position between its "end" and its "start" (note that
for an empty range, the "end" equals the "start" minus one).
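The relationship between "start", "end" and "width" can be checked with plain integer arithmetic. The following is a minimal base R sketch that does not use the IRanges package; the helper name range_end is hypothetical and introduced only for illustration.

```r
# Hypothetical helper (not part of IRanges): derive "end" from "start" and "width".
range_end <- function(start, width) {
  stopifnot(all(width >= 0L))   # a width must be non-negative
  start + width - 1L
}

# The range of integers from -20 to 400 starts at -20 and has a width of 421.
range_end(-20L, 421L)   # 400

# An empty range has a width of 0, so its "end" is its "start" minus one.
range_end(5L, 0L)       # 4
```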
The length of a Ranges object is the number of ranges in it, not the number of integer values in its ranges.
A Ranges object is considered empty iff all its ranges are empty.
Ranges objects have vector-like semantics, i.e. they only support single-subscript subsetting (unlike, for example, standard R data frames, which can be subsetted by row and by column).
In the code snippets below, x, y and object are
Ranges objects. Not all the functions described below will necessarily
work with all kinds of Ranges objects but they should work at least
for IRanges objects. Note that many more operations on Ranges objects are described in other
man pages of the IRanges package. See for example the man page for intra
range transformations (e.g. shift(), see
?`intra-range-methods`), or the man page for inter range
transformations (e.g. reduce(), see
?`inter-range-methods`), or the man page for
findOverlaps methods (see ?`findOverlaps-methods`),
or the man page for RangesList objects where the split
method for Ranges objects is documented.
length(x): The number of ranges in x.
start(x): The start values of the ranges. This is an integer vector of the same length as x.
width(x): The number of integer values in each range. This is a vector of non-negative integers of the same length as x.
end(x): The end values of the ranges, i.e. start(x) + width(x) - 1L.
mid(x): Returns the midpoint of each range, start(x) + floor((width(x) - 1)/2).
names(x): NULL or a character vector of the same length as x.
update(object, ...): Convenience method for combining multiple modifications of object in one single call. For example, object <- update(object, start=start(object)-2L, end=end(object)+2L) is equivalent to start(object) <- start(object)-2L; end(object) <- end(object)+2L.
tile(x, n, width, ...): Splits each range in x into subranges as specified by n (number of ranges) or width. Only one of n or width can be specified. The return value is an IRangesList of the same length as x. Ranges with a width less than the width argument are returned unchanged.
isEmpty(x): Return a logical value indicating whether x is empty or not.
as.matrix(x, ...): Convert x into a 2-column integer matrix containing start(x) and width(x). Extra arguments (...) are ignored.
as.data.frame(x, row.names=NULL, optional=FALSE, ...): Convert x into a standard R data frame object. row.names must be NULL or a character vector giving the row names for the data frame; optional and any additional arguments (...) are ignored. See ?as.data.frame for more information about these arguments.
as.integer(x): Convert x into an integer vector, by converting each range into the integer sequence formed by from:to and concatenating them together.
unlist(x, recursive = TRUE, use.names = TRUE): Similar to as.integer(x), except that it can add names to the elements.
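The expansion of ranges into their integer values, as described for as.integer(x), can be sketched in base R. This is an illustration only, not the IRanges implementation; the helper name expand_ranges is hypothetical, and ranges are represented here as plain integer vectors of starts and widths.

```r
# Hypothetical helper (not part of IRanges): expand ranges into their integer values.
expand_ranges <- function(start, width) {
  # seq_len(w) is used instead of from:to so that empty ranges (width 0)
  # contribute no values; in base R, 5:4 would count down and yield c(5, 4).
  unlist(Map(function(s, w) s + seq_len(w) - 1L, start, width))
}

# Three ranges: [1, 3], an empty range at position 10, and [20, 21].
expand_ranges(start = c(1L, 10L, 20L), width = c(3L, 0L, 2L))
# 1 2 3 20 21
```

Using seq_len() rather than the colon operator is a deliberate choice here: it keeps empty ranges from producing a descending sequence.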
x[[i]]: Return the integer vector start(x[i]):end(x[i]) denoted by i. Subscript i can be a single integer or a character string.
x[i]: Return a new Ranges object (of the same type as x) made of the selected ranges. i can be a numeric vector, a logical vector, NULL or missing. If x is a NormalIRanges object and i is a positive numeric subscript (i.e. a numeric vector of positive values), then i must be strictly increasing.
rep(x, times, length.out, each): Repeats the values in x through one of the following conventions:
- times: Vector giving the number of times to repeat each element if of length length(x), or to repeat the Ranges elements if of length 1.
- length.out: Non-negative integer. The desired length of the output vector.
- each: Non-negative integer. Each element of x is repeated each times.
c(x, ...): Combine x and the Ranges objects in ... together. Any object in ... must belong to the same class as x, or to one of its subclasses, or must be NULL. The result is an object of the same class as x. NOTE: Only works for IRanges (and derived) objects for now.
x * y: The arithmetic operation x * y is for centered zooming. It symmetrically scales the width of x by 1/y, where y is a numeric vector that is recycled as necessary. For example, x * 2 results in ranges with half their previous width but with approximately the same midpoint. The ranges have been "zoomed in". If y is negative, it is equivalent to x * (1/abs(y)). Thus, x * -2 would double the widths in x. In other words, x has been "zoomed out".
x + y: Expands the ranges in x on either side by the corresponding value in the numeric vector y.
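The two arithmetic operations can be illustrated with a base R sketch that works on plain vectors of starts and widths. The helper names expand and zoom are hypothetical, and the rounding rules used in zoom (floor for the new width, ceiling for the new start) are assumptions made for this illustration; the rounding applied by IRanges itself may differ.

```r
# Hypothetical helpers (not the IRanges implementation).

# x + y: move each start left by y and each end right by y.
expand <- function(start, width, y) {
  list(start = start - y, width = width + 2 * y)
}

# x * y: centered zooming; the width is scaled by 1/y around the midpoint.
zoom <- function(start, width, y) {
  y <- ifelse(y < 0, 1 / abs(y), y)     # x * -2 is treated as x * (1/2)
  new_width <- floor(width / y)         # assumed rounding rule
  centre <- start + (width - 1) / 2     # exact centre of each range
  new_start <- ceiling(centre - (new_width - 1) / 2)  # assumed rounding rule
  list(start = as.integer(new_start), width = as.integer(new_width))
}

# Two ranges, [1, 3] and [5, 10]:
s <- c(1L, 5L); w <- c(3L, 6L)
expand(s, w, 2L)   # starts -1 3, widths 7 10
zoom(s, w, 2)      # zoom in:  starts 2 7, widths 1 3
zoom(s, w, -2)     # zoom out: starts 0 2, widths 6 12
```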
show(x): By default the show method displays 5 head and 5 tail lines. The number of lines can be altered by setting the global options showHeadLines and showTailLines. If the object length is less than the sum of the options, the full object is displayed. These options affect GRanges, GAlignments, Ranges and XString objects.
A Ranges object
x is implicitly representing an arbitrary finite
set of integers (that are not necessarily consecutive). This set is the
set obtained by taking the union of all the values in all the ranges in
x. This representation is clearly not unique: many different
Ranges objects can be used to represent the same set of integers.
However one and only one of them is guaranteed to be "normal". By definition a Ranges object is said to be "normal" when its ranges are:
(a) not empty (i.e. they have a non-null width);
(b) not overlapping;
(c) ordered from left to right;
(d) not even adjacent (i.e. there must be a non empty gap between 2
consecutive ranges). Here is a simple algorithm to determine whether
x is "normal":
If length(x) == 0, then x is normal;
If length(x) == 1, then x is normal iff width(x) >= 1;
If length(x) >= 2, then x is normal iff:
start(x)[i] <= end(x)[i] < start(x)[i+1] <= end(x)[i+1]
for every 1 <= i < length(x). Per condition (d), consecutive ranges must also not be adjacent, i.e. end(x)[i] + 1 < start(x)[i+1].
The obvious advantage of using a "normal" Ranges object to represent a given finite set of integers is that it is the smallest in terms of number of ranges and therefore in terms of storage space. Requiring its ranges to be ordered from left to right also makes this representation unique. A special container (NormalIRanges) is provided for holding a "normal" IRanges object: a NormalIRanges object is just an IRanges object that is guaranteed to be "normal". Here are some methods related to the notion of "normal" Ranges:
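The normality test can be sketched in base R on plain vectors of starts and widths. This is a minimal illustration of conditions (a) to (d), including the non-adjacency requirement in (d), and not the IRanges implementation; the helper name is_normal is hypothetical.

```r
# Hypothetical helper (not part of IRanges): test conditions (a) to (d).
is_normal <- function(start, width) {
  n <- length(start)
  if (n == 0L) return(TRUE)
  if (any(width < 1L)) return(FALSE)   # (a) no empty range
  if (n == 1L) return(TRUE)
  end <- start + width - 1L
  # (b), (c), (d): each range must begin at least 2 positions after the
  # previous one ends, which leaves a non-empty gap between them.
  all(start[-1L] - end[-n] >= 2L)
}

is_normal(c(1L, 5L), c(3L, 2L))   # TRUE:  [1, 3] and [5, 6], gap at 4
is_normal(c(1L, 4L), c(3L, 2L))   # FALSE: [1, 3] and [4, 5] are adjacent
is_normal(c(5L, 1L), c(2L, 3L))   # FALSE: not ordered from left to right
```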
isNormal(x): Return TRUE or FALSE indicating whether x is "normal" or not.
whichFirstNotNormal(x): Return NA if x is normal, or the smallest valid index i for which x[1:i] is not "normal".
A Ranges object
x is considered to be "disjoint" if its ranges are
disjoint (i.e. non-overlapping). The
isDisjoint function is provided for testing whether a Ranges
object is "disjoint" or not:
isDisjoint(x): Return TRUE or FALSE indicating whether x is "disjoint" or not.
isDisjoint handles empty ranges (a.k.a. zero-width ranges) as follows: a single empty range A is considered to overlap with a single range B iff it's contained in B without being on the edge of B (in which case it would be ambiguous whether A is contained in or adjacent to B). In other words, a single empty range A is considered to overlap with a single range B iff
start(B) < start(A) and end(A) < end(B)
Because A is an empty range, it satisfies end(A) = start(A) - 1, so the above is equivalent to:
start(B) < start(A) <= end(B)
and also equivalent to:
start(B) <= end(A) < end(B)
Finally, it is also equivalent to:
compare(A, B) == 2
See ?`Ranges-comparison` for the meaning of the codes returned by the compare function.
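The overlap rule for an empty range can be checked with a small base R sketch. It encodes only the inequality start(B) < start(A) <= end(B) stated above and does not call isDisjoint or compare; the helper name empty_overlaps is hypothetical.

```r
# Hypothetical helper (not part of IRanges): does the empty range A,
# identified by its start, overlap the range B = [start_B, end_B]?
empty_overlaps <- function(start_A, start_B, end_B) {
  start_B < start_A & start_A <= end_B
}

# Slide an empty range along B = [12, 15].
sapply(11:17, function(i) empty_overlaps(i, 12L, 15L))
# FALSE FALSE TRUE TRUE TRUE FALSE FALSE
```

An empty range starting at 12 or at 16 sits on an edge of B, so it is not counted as overlapping.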
## ---------------------------------------------------------------------
## Basic manipulation
## ---------------------------------------------------------------------
x <- IRanges(start=c(2:-1, 13:15), width=c(0:3, 2:0))
x
length(x)
start(x)
width(x)
end(x)
isEmpty(x)
as.matrix(x)
as.data.frame(x)

## Subsetting:
x[4:2]                  # 3 ranges
x[-1]                   # 6 ranges
x[FALSE]                # 0 range
x0 <- x[width(x) == 0]  # 2 ranges
isEmpty(x0)

## Use the replacement methods to resize the ranges:
width(x) <- width(x) * 2 + 1
x
end(x) <- start(x)      # equivalent to width(x) <- 1
x
width(x) <- c(2, 0, 4)
x
start(x) <- end(x) - 2  # resize the 3rd range
x

## Name the elements:
names(x)
names(x) <- c("range1", "range2")
x
x[is.na(names(x))]      # 5 ranges
x[!is.na(names(x))]     # 2 ranges

ir <- IRanges(c(1,5), c(3,10))
ir*1       # no change
ir*c(1,2)  # zoom second range by 2X
ir*-2      # zoom out 2X

## ---------------------------------------------------------------------
## isDisjoint()
## ---------------------------------------------------------------------

## On a Ranges object:
isDisjoint(IRanges(c(2,5,1), c(3,7,3)))  # FALSE
isDisjoint(IRanges(c(2,9,5), c(3,9,6)))  # TRUE
isDisjoint(IRanges(1, 5))                # TRUE

## Handling of empty ranges:
x <- IRanges(c(11, 16, 11, -2, 11), c(15, 29, 10, 10, 10))
stopifnot(isDisjoint(x))

## Sliding an empty range along a non-empty range:
sapply(11:17, function(i) compare(IRanges(i, width=0), IRanges(12, 15)))
sapply(11:17, function(i) isDisjoint(c(IRanges(i, width=0), IRanges(12, 15))))