data.table
Enhanced data.frame
Same internal structure as data.frame (i.e. list of vectors) but fast subset, fast merge, and fast grouping.
- Keywords
- data
Usage
data.table(..., keep.rownames=FALSE, check.names=TRUE, key=NULL)
Arguments
- ...
- Just as ... in data.frame. These arguments are of either the form value or tag = value. Column names are created based on the tag (if present) or the deparsed argument itself. The data.table can be f
- keep.rownames
- If ... is a matrix or data.frame, TRUE will retain the rownames of that object in a column 'rn'. If the matrix or data.frame already has a column called 'rn' it will be renamed 'rn.1' via make.names.
- check.names
- Just as check.names in data.frame(). For example, this replaces spaces in column names with periods and ensures column names are valid R object names.
- key
- Character vector of length 1 containing one or more column names separated by commas, which is passed to setkey.
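The arguments above can be illustrated with a short sketch. It assumes the data.table package is installed; the variable names (x, DT_names, DT_check, DT_key) are made up for illustration, and details may differ between package versions.

```r
library(data.table)

# Column names come from the tag when present, otherwise from the
# deparsed argument itself.
x <- 1:3
DT_names <- data.table(x, y = 4:6)
names(DT_names)    # "x" "y"

# check.names = TRUE turns the space into a period so the name is valid.
DT_check <- data.table("my col" = 1:3, check.names = TRUE)
names(DT_check)    # "my.col"

# key is a single string; several column names are separated by commas.
DT_key <- data.table(a = c(2L, 1L, 2L), b = c(3L, 9L, 1L), key = "a,b")
key(DT_key)        # "a" "b"
```

Setting a key also sorts the rows by the key columns, so DT_key is ordered by a and then by b.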
Details
data.table creates a data.table from its arguments just as data.frame does. DT() is an alias for data.table() and is often used instead of as.data.table().
A data.table is a list of vectors, just like a data.frame, however:
- it never has rownames. Instead it may have an optional key of one or more columns using setkey. This key can be used for row indexing instead of rownames.
- when the data.table has over 20 rows the print method displays column names at the bottom as well as at the top to save scrolling up at the console.
- character vectors may be passed in but they are automatically converted to factor. A data.table does not allow character columns for time and space reasons.
- however the main difference is enhanced functionality in [.data.table where most documentation for this package lives.
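As a minimal sketch of using a key for row indexing in place of rownames, the following assumes the data.table package is installed. The table and its column names (id, v) are hypothetical, and an integer key is used so the example does not depend on how character columns are stored.

```r
library(data.table)

DT_keyed <- data.table(id = c(30L, 10L, 20L), v = c("c", "a", "b"))

# setkey sorts the table by id and records id as the key.
setkey(DT_keyed, id)
key(DT_keyed)            # "id"

# Look up a row by its key value rather than by a rowname.
hit <- DT_keyed[J(20L)]
hit
```

J() builds the lookup table that is joined to the key inside [.data.table; see the documentation of [.data.table for the full set of options.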
Several methods are provided for data tables, including is.na, na.omit, t, and others.
Value
- A data.table.
Note
keep.rownames and check.names, if supplied, must be written in full since they appear after the .... R does not allow partial argument names after .... For example data.table(DF,keep=TRUE) will create a column called 'keep' containing TRUE and this is correct behaviour. Most likely data.table(DF,keep.rownames=TRUE) was intended.
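The difference described in this note can be seen in a small sketch, assuming the data.table package is installed. The data.frame DF_rn and its rownames are invented for illustration.

```r
library(data.table)

DF_rn <- data.frame(a = 1:3, row.names = c("r1", "r2", "r3"))

# 'keep' is not partially matched to keep.rownames: it falls into ...
# and becomes an ordinary column named 'keep'.
partial <- data.table(DF_rn, keep = TRUE)
names(partial)

# Written in full, the rownames are retained in a column called 'rn'.
full <- data.table(DF_rn, keep.rownames = TRUE)
names(full)
```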
See Also
Examples
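library(data.table)

# Build the same two columns as a data.frame and as a data.table
DF = data.frame(a=1:5, b=letters[1:5])
DT = data.table(a=1:5, b=letters[1:5])

# Compare the conversion routes, the dimensions and a column
identical(as.data.table(DF), DT)
identical(data.table(DF), DT)
identical(dim(DT),dim(DF))
identical(DF$a, DT$a)
DT
tables()    # summarise the data.tables currently in memory

# Column-bind and row-bind data.tables
identical(data.table(DT,DT), cbind(DT,DT))
DT2=rbind(DT,DT)

# The tags A and B prefix the column names; key="A.b" keys on that column
DT3 = data.table(A=DT, B=DT, key="A.b")
tables()

test.data.table()         # run the package's test suite
example("[.data.table")   # run the examples for [.data.table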