## Element-wise (aka "parallel") comparison of 2 Ranges objects
## ------------------------------------------------------------
"compare"(x, y)
rangeComparisonCodeToLetter(code)
## match()
## -------
"match"(x, table, nomatch=NA_integer_, incomparables=NULL, method=c("auto", "quick", "hash"))
## selfmatch()
## -----------
"selfmatch"(x, method=c("auto", "quick", "hash"))
## order()
## -------
"order"(..., na.last=TRUE, decreasing=FALSE)
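code: A vector of codes as returned by compare.
nomatch: The value to return when no match is found. It is coerced to an integer.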
method: Use a Quicksort-based (method="quick") or a hash-based (method="hash") algorithm. The latter tends to give better performance, except perhaps for some pathological inputs that have not been identified so far. When method="auto" is specified, the most efficient algorithm is used: the hash-based algorithm if length(x) <= 2^29, otherwise the Quicksort-based algorithm.
decreasing: TRUE or FALSE.
Ranges are ordered by starting position first, and then by width.
This way, the space of ranges is totally ordered.
On a Ranges object, order, sort, and rank are consistent with this order.
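The ordering rule above can be imitated in base R without any Ranges object. This is only a sketch: the `start` and `width` vectors are made-up illustrative data, and it relies on base `order()` accepting several keys, not on the IRanges methods themselves.

```r
# Illustrative data (not taken from this page): four ranges given by
# their starting positions and widths.
start <- c(5L, 1L, 5L, 3L)
width <- c(2L, 4L, 0L, 1L)

# Order by starting position first, then by width to break ties.
ord <- order(start, width)

# Apply the permutation to get the ranges in ascending order.
sorted <- data.frame(start = start[ord], width = width[ord])
print(ord)
print(sorted)
```

Because every pair of distinct (start, width) values compares as strictly smaller or larger, the two-key sort leaves no ambiguity except between identical ranges.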
compare(x, y): Performs element-wise (aka "parallel") comparison of two Ranges objects x and y, that is, returns an integer vector where the i-th element is a code describing how x[i] is qualitatively positioned with respect to y[i].

Here is a summary of the 13 predefined codes (and their letter equivalents) and their meanings:

    -6 a: x[i]: .oooo.......         6 m: x[i]: .......oooo.
          y[i]: .......oooo.              y[i]: .oooo.......

    -5 b: x[i]: ..oooo......         5 l: x[i]: ......oooo..
          y[i]: ......oooo..              y[i]: ..oooo......

    -4 c: x[i]: ...oooo.....         4 k: x[i]: .....oooo...
          y[i]: .....oooo...              y[i]: ...oooo.....

    -3 d: x[i]: ...oooooo...         3 j: x[i]: .....oooo...
          y[i]: .....oooo...              y[i]: ...oooooo...

    -2 e: x[i]: ..oooooooo..         2 i: x[i]: ....oooo....
          y[i]: ....oooo....              y[i]: ..oooooooo..

    -1 f: x[i]: ...oooo.....         1 h: x[i]: ...oooooo...
          y[i]: ...oooooo...              y[i]: ...oooo.....

     0 g: x[i]: ...oooooo...
          y[i]: ...oooooo...
Note that this way of comparing ranges is a refinement over the standard ranges comparison defined by the ==, !=, <=, >=, <, and > operators. In particular, a code that is < 0, = 0, or > 0 corresponds to x[i] < y[i], x[i] == y[i], or x[i] > y[i], respectively.
The compare method for Ranges objects is guaranteed to return predefined codes only, but methods for other objects (e.g. for GenomicRanges objects) can return non-predefined codes. As with the predefined codes, the sign of any non-predefined code must tell whether x[i] is less than or greater than y[i].
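To make the 13 codes concrete, here is a base-R sketch that derives them from starts and widths. The function name `compare_sketch` is hypothetical and this is not the package's implementation: it assumes non-empty ranges only (width >= 1), so the special handling of zero-width ranges is deliberately left out.

```r
# Hypothetical helper, not part of IRanges: classify how each range x
# sits relative to the matching range y. Assumes width >= 1 everywhere.
compare_sketch <- function(x_start, x_width, y_start, y_width) {
  stopifnot(all(x_width >= 1L), all(y_width >= 1L))
  x_end1 <- x_start + x_width  # one position past the end of x
  y_end1 <- y_start + y_width  # one position past the end of y

  # Overlapping configurations: combine the sign of the start
  # difference and the sign of the end difference into codes -4..4.
  code <- as.integer(3 * sign(x_start - y_start) + sign(x_end1 - y_end1))

  # Disjoint configurations override the overlap codes.
  code[x_end1 == y_start] <- -5L  # x is adjacent to y, on its left
  code[x_end1 <  y_start] <- -6L  # x is before y, with a gap
  code[y_end1 == x_start] <-  5L  # x is adjacent to y, on its right
  code[y_end1 <  x_start] <-  6L  # x is after y, with a gap
  code
}

# Ranges starting at 1..11 with width 4, compared against the range 6..9.
codes <- compare_sketch(1:11, 4L, 6L, 4L)
print(codes)
```

The sign of each code agrees with the ordering by start and then width, which is what makes the codes a refinement of the standard comparison operators.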
rangeComparisonCodeToLetter(code): Translates the codes returned by compare. The 13 predefined codes are translated as follows: -6 -> a; -5 -> b; -4 -> c; -3 -> d; -2 -> e; -1 -> f; 0 -> g; 1 -> h; 2 -> i; 3 -> j; 4 -> k; 5 -> l; 6 -> m. Any non-predefined code is translated to X. The translated codes are returned in a factor with 14 levels: a, b, ..., l, m, X.
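The translation described above amounts to a lookup into 14 factor levels. A minimal base-R sketch follows, assuming integer codes with no missing values; `code_to_letter_sketch` is a hypothetical name, not the package function.

```r
# Hypothetical helper: map codes -6..6 to letters a..m, anything else to X.
code_to_letter_sketch <- function(code) {
  lv <- c(letters[1:13], "X")              # the 14 levels: a, ..., m, X
  predefined <- code >= -6L & code <= 6L
  idx <- ifelse(predefined, code + 7L, 14L) # shift -6..6 onto 1..13
  factor(lv[idx], levels = lv)
}

letters_out <- code_to_letter_sketch(c(-6L, 0L, 6L, 10L))
print(letters_out)
```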
match(x, table, nomatch=NA_integer_, method=c("auto", "quick", "hash")): Returns an integer vector of the length of x, containing the index of the first matching range in table (or nomatch if there is no matching range) for each range in x.
selfmatch(x, method=c("auto", "quick", "hash")): Equivalent to, but more efficient than, match(x, x, method=method).
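The semantics of match and selfmatch can be imitated in base R by matching on a combined start/width key. This is a sketch only: the key construction with `paste()` is an assumption made for illustration and says nothing about how the Quicksort-based or hash-based algorithms work internally. The data mirror the x2 object used in the examples further down.

```r
# Starts and widths of eight ranges (same values as x2 in the examples).
x_start <- c(20L, 8L, 20L, 22L, 25L, 20L, 22L, 22L)
x_width <- c(4L, 0L, 11L, 5L, 0L, 9L, 5L, 0L)

# Two ranges are equal when both start and width agree, so a combined
# key is enough for base match().
x_key <- paste(x_start, x_width)
table_key <- x_key[c(2:4, 7:8)]   # a lookup table built from a subset

m  <- match(x_key, table_key)     # index of first match, NA if none
sm <- match(x_key, x_key)         # what selfmatch computes
is_in <- x_key %in% table_key     # logical version of the lookup
print(m)
print(sm)
print(is_in)
```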
duplicated(x, fromLast=FALSE, method=c("auto", "quick", "hash")): Determines which elements of x are equal to elements with smaller subscripts, and returns a logical vector indicating which elements are duplicates. duplicated(x) is equivalent to, but more efficient than, duplicated(as.data.frame(x)) on a Ranges object. See duplicated in the base package for more details.
unique(x, fromLast=FALSE, method=c("auto", "quick", "hash")): Removes duplicate ranges from x. unique(x) is equivalent to, but more efficient than, unique(as.data.frame(x)) on a Ranges object. See unique in the base package for more details.
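The data-frame equivalence mentioned above can be tried directly in base R. In this sketch the data frame is built by hand from start and width columns (an assumption made so the example runs without a Ranges object); the values mirror x2 from the examples further down.

```r
# Eight ranges described by start and width; rows 4 and 7 are identical.
df <- data.frame(
  start = c(20L, 8L, 20L, 22L, 25L, 20L, 22L, 22L),
  width = c(4L, 0L, 11L, 5L, 0L, 9L, 5L, 0L)
)

dup <- duplicated(df)   # TRUE where an earlier row has the same values
uniq <- unique(df)      # rows with duplicates removed
print(dup)
print(uniq)
```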
x %in% table: A shortcut for finding the ranges in x that match any of the ranges in table. Returns a logical vector of length equal to the number of ranges in x.
findMatches(x, table, method=c("auto", "quick", "hash")): An enhanced version of match that returns all the matches in a Hits object.
countMatches(x, table, method=c("auto", "quick", "hash")): Returns an integer vector of the length of x containing the number of matches in table for each element in x.
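The difference between "first match", "all matches", and "number of matches" can be illustrated in base R with an all-pairs equality matrix. This is a sketch under stated assumptions: it uses a quadratic `outer()` comparison on a pasted start/width key, and a plain two-column matrix stands in for the Hits object.

```r
# Keys for eight query ranges and a five-range table (values mirror
# x2 and x2[c(2:4, 7:8)] from the examples further down).
x_key <- paste(c(20L, 8L, 20L, 22L, 25L, 20L, 22L, 22L),
               c(4L, 0L, 11L, 5L, 0L, 9L, 5L, 0L))
table_key <- x_key[c(2:4, 7:8)]

# eq[i, j] is TRUE when query i equals table entry j.
eq <- outer(x_key, table_key, "==")

# All (query, table) index pairs that match, sorted by query index.
hits <- which(eq, arr.ind = TRUE)
hits <- hits[order(hits[, "row"], hits[, "col"]), , drop = FALSE]

# Number of matching table entries for each query.
counts <- rowSums(eq)
print(hits)
print(counts)
```

The all-pairs matrix is only practical for small inputs; it is used here because it makes the three notions easy to read off.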
order(...): Returns a permutation which rearranges its first argument (a Ranges object) into ascending order, breaking ties by further arguments (also Ranges objects).
sort(x): Sorts x. See sort in the base package for more details.
rank(x, na.last=TRUE, ties.method=c("average", "first", "random", "max", "min")): Returns the sample ranks of the ranges in x. See rank in the base package for more details.
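A rank is the inverse of the ordering permutation. The base-R sketch below derives "first"-style ranks from a two-key `order()` call; it is an illustration on plain start and width vectors (mirroring x2 from the examples), not the method the package uses.

```r
start <- c(20L, 8L, 20L, 22L, 25L, 20L, 22L, 22L)
width <- c(4L, 0L, 11L, 5L, 0L, 9L, 5L, 0L)

# Permutation that sorts by start, then width; ties keep input order.
o <- order(start, width)

# Invert the permutation: the range at position o[k] gets rank k.
rank_first <- integer(length(o))
rank_first[o] <- seq_along(o)
print(o)
print(rank_first)
```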
findOverlaps for finding overlapping ranges.
## ---------------------------------------------------------------------
## A. ELEMENT-WISE (AKA "PARALLEL") COMPARISON OF 2 Ranges OBJECTS
## ---------------------------------------------------------------------
x0 <- IRanges(1:11, width=4)
x0
y0 <- IRanges(6, 9)
compare(x0, y0)
compare(IRanges(4:6, width=6), y0)
compare(IRanges(6:8, width=2), y0)
compare(x0, y0) < 0 # equivalent to 'x0 < y0'
compare(x0, y0) == 0 # equivalent to 'x0 == y0'
compare(x0, y0) > 0 # equivalent to 'x0 > y0'
rangeComparisonCodeToLetter(-10:10)
rangeComparisonCodeToLetter(compare(x0, y0))
## Handling of zero-width ranges (a.k.a. empty ranges):
x1 <- IRanges(11:17, width=0)
x1
compare(x1, x1[4])
compare(x1, IRanges(12, 15))
## Note that x1[2] and x1[6] are empty ranges on the edge of non-empty
## range IRanges(12, 15). Even though -1 and 3 could also be considered
## valid codes for describing these configurations, compare()
## considers x1[2] and x1[6] to be *adjacent* to IRanges(12, 15), and
## thus returns codes -5 and 5:
compare(x1[2], IRanges(12, 15)) # -5
compare(x1[6], IRanges(12, 15)) # 5
x2 <- IRanges(start=c(20L, 8L, 20L, 22L, 25L, 20L, 22L, 22L),
width=c( 4L, 0L, 11L, 5L, 0L, 9L, 5L, 0L))
x2
which(width(x2) == 0) # 3 empty ranges
x2[2] == x2[2] # TRUE
x2[2] == x2[5] # FALSE
x2 == x2[4]
x2 >= x2[3]
## ---------------------------------------------------------------------
## B. match(), selfmatch(), %in%, duplicated(), unique()
## ---------------------------------------------------------------------
table <- x2[c(2:4, 7:8)]
match(x2, table)
x2 %in% table
duplicated(x2)
unique(x2)
## ---------------------------------------------------------------------
## C. findMatches(), countMatches()
## ---------------------------------------------------------------------
findMatches(x2, table)
countMatches(x2, table)
x2_levels <- unique(x2)
countMatches(x2_levels, x2)
## ---------------------------------------------------------------------
## D. order() AND RELATED METHODS
## ---------------------------------------------------------------------
order(x2)
sort(x2)
rank(x2, ties.method="first")