In data.table parlance, all set* functions change their input by reference. That is, no copy is made at all, other than temporary working memory, which is as large as one column. The only other data.table operator that modifies input by reference is :=. Check out the See Also section below for the other set* functions data.table provides.
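As a minimal sketch of what "by reference" means in practice, the snippet below checks that the table occupies the same memory address before and after sorting. The table and its column names (`id`, `val`, `val_upper`) are made up for illustration.

```r
library(data.table)

# Hypothetical table used only for this illustration.
DT <- data.table(id = c(3L, 1L, 2L), val = c("c", "a", "b"))

# Record the memory address of the table before sorting.
addr_before <- address(DT)

# setorder() sorts in place; the result does not need to be assigned.
setorder(DT, id)
addr_after <- address(DT)

# := also modifies DT in place, here by adding a column.
DT[, val_upper := toupper(val)]

# TRUE: the same object was reordered rather than copied.
identical(addr_before, addr_after)
```

Because nothing is assigned back, any other variable that points at the same table sees the new row order too.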
setorder (and setorderv) reorders the rows of a data.table based on the columns (and column order) provided. It reorders the table by reference and is therefore very memory efficient.
Also, x[order(.)] is now optimised internally to use data.table's fast order by default. data.table always reorders in C-locale. To sort by session locale, use x[base::order(.)] instead.
The bit64::integer64 type is also supported for reordering rows of a data.table.
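The difference between the two ordering paths is easiest to see with mixed-case strings. This is a small sketch on a made-up column `x`; the result of the `base::order` call depends on the session locale, so no output is asserted for it.

```r
library(data.table)

# Hypothetical mixed-case data.
DT <- data.table(x = c("b", "A", "a", "B"))

# Optimised path: data.table's internal ordering, always C-locale,
# so upper-case letters sort before lower-case ones.
c_locale <- DT[order(x)]$x
c_locale
# "A" "B" "a" "b"

# Explicit base::order(): collation follows the session locale,
# so the result may differ between machines.
session_locale <- DT[base::order(x)]$x
session_locale
```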
setorder(x, ..., na.last=FALSE)
setorderv(x, cols, order=1L, na.last=FALSE)
# optimised to use data.table's internal fast order
# x[order(., na.last=TRUE)]
x: A data.table.
...: The columns to sort by. Do not quote the column names. If ... is missing (e.g., setorder(x)), x is rearranged based on all columns in ascending order by default. To sort by a column in descending order, prefix it with "-", e.g., setorder(x, a, -b, c). The -b works when b is of type character as well.
cols: A character vector of column names of x to order by. Do not add "-" here; use the order argument instead.
order: An integer vector with only possible values of 1 and -1, corresponding to ascending and descending order. The length of order must be either 1 or equal to that of cols. If length(order) == 1, it is recycled to length(cols).
na.last: logical. If TRUE, missing values in the data are placed last; if FALSE, they are placed first; if NA, they are removed. na.last=NA is valid only for x[order(., na.last)], and its default is TRUE. setorder and setorderv only accept TRUE/FALSE, with default FALSE.
The input is modified by reference and returned (invisibly), so it can be used in compound statements, e.g., setorder(DT,a,-b)[, cumsum(c), by=list(a,b)]. If you require a copy, take a copy first (using DT2 = copy(DT)). See ?copy.
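The following sketch illustrates two points from the argument descriptions: recycling of a length-one order in setorderv, and chaining on the invisibly returned table. The table and the names `desc_c` and `res` are illustrative only.

```r
library(data.table)

# Hypothetical table; every (a, b) pair is unique.
DT <- data.table(a = c(2L, 1L, 2L, 1L),
                 b = c("x", "y", "y", "x"),
                 c = 1:4)

# A single order value is recycled to every column in cols,
# so both a and b are sorted in descending order here.
setorderv(DT, cols = c("a", "b"), order = -1L)
desc_c <- DT$c
desc_c
# 3 1 2 4

# setorder() returns DT invisibly, so a [ ] query can follow directly.
res <- setorder(DT, a, -b)[, cumsum(c), by = list(a, b)]
res$V1
# 2 4 3 1
```

Use copy(DT) before the call if the original row order still matters, since both calls above change DT itself.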
data.table implements fast radix-based ordering. In versions <= 1.9.2, it was only capable of increasing order (ascending). From 1.9.4 on, the functionality has been extended to decreasing order (descending) as well. Columns of numeric types (i.e., double) have their last two bytes rounded off while computing order, by default, to avoid any unexpected behaviour due to limitations in representing floating point numbers precisely. Have a look at setNumericRounding to learn more.
setorder accepts unquoted column names (with names preceded by a - sign for descending order) and reorders data.table rows by reference, e.g., setorder(x, a, -b, c). Note that -b also works with columns of type character, unlike base::order, which requires -xtfrm(y) instead (which is slow). setorderv in turn accepts a character vector of column names and an integer vector of column order separately.
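To make the comparison concrete, here is a small sketch that sorts a character column in descending order three ways. The data are invented, and the lower-case values are chosen so that C-locale and session-locale collation agree.

```r
library(data.table)

# Hypothetical table with a character column b.
DT <- data.table(a = c(1L, 1L, 2L, 2L),
                 b = c("x", "y", "x", "y"),
                 v = 1:4)

# data.table: prefix the character column with "-" directly.
by_ref <- setorder(copy(DT), a, -b)

# Same ordering with setorderv: names and directions given separately.
by_refv <- setorderv(copy(DT), c("a", "b"), c(1L, -1L))

# base::order cannot negate a character vector, so it needs xtfrm().
by_base <- DT[base::order(a, -xtfrm(b))]

by_ref$v
# 2 1 4 3
```

The copy() calls keep DT unchanged, so the three approaches all start from the same input.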
Note that setkey always sorts in ascending order only, and differs from setorder in that it additionally sets the sorted attribute.
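A short sketch of that difference, using two throwaway tables (`DT1`, `DT2`) that hold the same data:

```r
library(data.table)

# Two hypothetical tables with identical contents.
DT1 <- data.table(a = c(3L, 1L, 2L))
DT2 <- copy(DT1)

setorder(DT1, a)  # sorts only
setkey(DT2, a)    # sorts and marks the table as keyed on a

key(DT1)
# NULL
key(DT2)
# "a"
```

Both tables end up with the same row order; only DT2 carries the key.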
The na.last argument is FALSE by default for setorder and setorderv, to be consistent with data.table's setkey, and TRUE for x[order(.)], to be consistent with base::order. Only x[order(.)] can have na.last = NA, as it is a subset operation, as opposed to setorder or setorderv, which reorder the data.table by reference.
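The defaults described above can be compared side by side. This is a minimal sketch on a made-up integer column containing one missing value:

```r
library(data.table)

# Hypothetical column with one NA.
DT <- data.table(x = c(2L, NA, 1L))

# setorder: NAs first by default (na.last = FALSE).
na_first <- setorder(copy(DT), x)$x
na_first
# NA 1 2

# setorder with na.last = TRUE: NAs last.
na_last <- setorder(copy(DT), x, na.last = TRUE)$x
na_last
# 1 2 NA

# x[order(.)]: NAs last by default, as in base::order.
sub_default <- DT[order(x)]$x
sub_default
# 1 2 NA

# Only the subset form can drop the NA rows.
sub_dropped <- DT[order(x, na.last = NA)]$x
sub_dropped
# 1 2
```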
If setorder results in reordering of the rows of a keyed data.table, then its key will be set to NULL.
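A brief sketch of the key being dropped, with an invented two-column table keyed on `a` and then reordered by `b`:

```r
library(data.table)

# Hypothetical keyed table.
DT <- data.table(a = 1:3, b = c(3L, 1L, 2L))
setkey(DT, a)
key_before <- key(DT)
key_before
# "a"

# Sorting by b changes the row order, so the key on a no longer holds.
setorder(DT, b)
key_after <- key(DT)
key_after
# NULL
```

Call setkey again afterwards if the key is still needed.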
See Also: setkey, setcolorder, setattr, setnames, set, :=, setDT, setDF, copy, setNumericRounding
library(data.table)

set.seed(45L)
DT = data.table(A=sample(3, 10, TRUE),
                B=sample(letters[1:3], 10, TRUE), C=sample(10))
# setorder: sort by A ascending, then B descending, by reference
setorder(DT, A, -B)
# same as above, but using setorderv
setorderv(DT, c("A", "B"), c(1L, -1L))