data.table (version 1.7.6)

setkey: Create key on a data table

Description

Sorts a data.table and marks it as sorted. The sorted columns are the key. The key can consist of any columns, in any order. Columns are always sorted in ascending order.

Usage

setkey(x, ..., loc=parent.frame(), verbose=getOption("datatable.verbose",FALSE))
key(x)
key(x) <- value
haskey(x)
copy(x)

Arguments

x
An unquoted name of a data.table.
...
The columns to sort by. Do not quote the column names. If ... is missing, all the columns are used.
value
A character vector of column names.
loc
The data.table must already exist in this frame and is sorted by reference in this frame. loc=.GlobalEnv is often useful within functions.
verbose
If TRUE, prints status and progress information.
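
As a minimal sketch of how the ... argument behaves, the snippet below keys a small table on two columns and then on all columns. The table contents and the names DT, DT_all and id are hypothetical, chosen only for illustration; integer columns are used so that the result does not depend on how a given version treats character columns in a key.

```r
library(data.table)

# Hypothetical table: two integer columns to key on, plus a row id.
DT = data.table(A = c(2L, 1L, 2L, 1L),
                B = c(20L, 20L, 10L, 10L),
                id = 1:4)

setkey(DT, A, B)   # unquoted column names; sorts by A, then by B within A
key(DT)            # character vector naming the key columns
haskey(DT)         # TRUE once a key is set
DT$id              # the original row ids in their new order

# With no columns given, every column becomes part of the key, in column order.
DT_all = data.table(A = c(2L, 1L, 2L), B = c(1L, 3L, 0L))
setkey(DT_all)
key(DT_all)
```

The order of the names passed in ... determines the sort priority, so setkey(DT, B, A) would sort by B first.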

Value

No value is returned. The data.table is modified by reference. If you need a copy, take one first (using DT2=copy(DT)). copy() may also be useful before := is used to subassign to a column by reference.

Details

The sort is first attempted with the very fast "radix" method of sort.list. If that fails, the sort falls back to the default method of order. This logic is repeated column by column. The sort is stable; i.e., the order of ties (if any) is preserved. Assigning NULL with key(x) <- NULL removes the key.
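
The stability property can be checked with a small sketch like the one below. The table and the id column are hypothetical; id simply records each row's original position so that the order of ties is visible after keying.

```r
library(data.table)

# Hypothetical table with tied values in A; id records the original row order.
DT = data.table(A = c(2L, 1L, 2L, 1L), id = 1:4)

setkey(DT, A)   # sort by A in ascending order

# Rows that tie on A keep their original relative order:
# the A == 1 rows (ids 2, 4) come first, then the A == 2 rows (ids 1, 3).
DT$id           # 2 4 1 3
```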

References

http://en.wikipedia.org/wiki/Radix_sort
http://en.wikipedia.org/wiki/Counting_sort

See Also

data.table, tables, J, sort.list

Examples

DT = data.table(A=5:1,B=letters[5:1])
DT # before
setkey(DT,B)  # re-orders table and marks it sorted.
DT # after
tables()      # KEY column reports the key'd columns
key(DT)       # returns the key column names
key(DT) = "A" # re-sorts the table by A and makes A the key

DT = data.table(A=5:1,B=letters[5:1])
DT2 = DT              # not enough to copy
setkey(DT2,B)         # does not copy on write to DT2
identical(DT,DT2)     # TRUE. DT and DT2 are two names for the same keyed table

DT = data.table(A=5:1,B=letters[5:1])
DT2 = copy(DT)        # explicit copy is required for data.table
setkey(DT2,B)         # just changes DT2
identical(DT,DT2)     # FALSE. DT and DT2 are now different tables
