rbindlist
Makes one data.table from a list of many
Same as do.call("rbind", l) on data.frames, but much faster. See Details for more.
- Keywords
- data
Usage
rbindlist(l, use.names=fill, fill=FALSE, idcol=NULL)
# rbind(..., use.names=TRUE, fill=FALSE, idcol=NULL)
Arguments
- l
A list containing data.table, data.frame or list objects. At least one of the inputs should have column names set. … is the same but you pass the objects by name separately.
- use.names
If TRUE, items will be bound by matching column names. By default FALSE for rbindlist (for backwards compatibility) and TRUE for rbind (consistency with base). Columns with duplicate names are bound in the order of occurrence, similar to base. When TRUE, at least one item of the input list has to have non-null column names.
- fill
If TRUE, fills missing columns with NAs. By default FALSE. When TRUE, use.names has to be TRUE, and all items of the input list have to have non-null column names.
- idcol
Generates an index column. Default (NULL) is not to. If idcol=TRUE then the column is auto named .id. Alternatively the column name can be directly provided, e.g., idcol = "id". If input is a named list, ids are generated using the names, else using an integer vector from 1 to the length of the input list. See examples.
Details
Each item of l can be a data.table, data.frame or list, including NULL (skipped) or an empty object (0 rows). rbindlist is most useful when there are a variable number of (potentially many) objects to stack, such as returned by lapply(fileNames, fread). rbind however is most useful to stack two or three objects which you know in advance. … should contain at least one data.table for rbind(…) to call the fast method and return a data.table, whereas rbindlist(l) always returns a data.table even when stacking a plain list with a data.frame, for example.
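As a small sketch of the behaviour described above: the toy inputs are invented for illustration, and in-memory tables stand in for files that would normally be read with fread.

```r
library(data.table)

# Hypothetical inputs: a variable number of pieces built with lapply(),
# standing in for lapply(fileNames, fread).
pieces = lapply(1:3, function(i) data.table(id = i, value = i * 10))
stacked = rbindlist(pieces)
nrow(stacked)        # 3

# NULL items are skipped; a plain list stacked with a data.frame
# still comes back as a data.table.
mixed = list(
  list(x = 1:2, y = c("a", "b")),
  NULL,
  data.frame(x = 3L, y = "c", stringsAsFactors = FALSE)
)
res = rbindlist(mixed)
is.data.table(res)   # TRUE
nrow(res)            # 3
```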
In versions <= v1.9.2, each item for rbindlist should have the same number of columns as the first non-empty item. rbind.data.table gained a fill argument to fill missing columns with NA in v1.9.2, which allowed rbind(…) to bind an unequal number of columns.
In version > v1.9.2, these functionalities were extended to rbindlist (and written entirely in C for speed). rbindlist has a use.names argument, which is set to FALSE by default for backwards compatibility. It also has a fill argument and can bind unequal columns when fill is set to TRUE.
With these changes, the only difference between rbind(…) and rbindlist(l) is the default value of their use.names argument.
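The effect of use.names can be seen with two tables whose columns are in a different order. This is a minimal sketch with made-up data; use.names is passed explicitly to rbindlist so that the result does not depend on the default of the installed version.

```r
library(data.table)

DT1 = data.table(A = 1:2, B = 3:4)
DT2 = data.table(B = 5:6, A = 7:8)   # same columns, different order

# Bound by position: DT2's first column (B) lands under A.
by_position = rbindlist(list(DT1, DT2), use.names = FALSE)
by_position$A   # 1 2 5 6

# Bound by name: rbind() uses use.names = TRUE by default.
by_name = rbind(DT1, DT2)
by_name$A       # 1 2 7 8

# rbindlist() gives the same rows once use.names = TRUE is set.
by_name2 = rbindlist(list(DT1, DT2), use.names = TRUE)
by_name2$A      # 1 2 7 8
```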
If column i does not have the same type in all input items (e.g., a data.table may be bound with a list, or a column is a factor in one item and character in others), the values are coerced to the highest type (SEXPTYPE).
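A brief illustration of this coercion, using invented single-column tables; only the integer/double and integer/character cases are shown here.

```r
library(data.table)

ints    = data.table(x = 1:2)          # integer column
doubles = data.table(x = c(0.5, 1.5))  # double column
chars   = data.table(x = c("a", "b"))  # character column

# integer + double: the result column is double
num = rbindlist(list(ints, doubles))
class(num$x)   # "numeric"

# integer + character: the result column is character
chr = rbindlist(list(ints, chars))
class(chr$x)   # "character"
chr$x          # "1" "2" "a" "b"
```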
Note that any additional attributes that might exist on individual items of the input list would not be preserved in the result.
Value
An unkeyed data.table containing a concatenation of all the items passed in.
Examples
# NOT RUN {
library(data.table)
# default case
DT1 = data.table(A=1:3,B=letters[1:3])
DT2 = data.table(A=4:5,B=letters[4:5])
l = list(DT1,DT2)
rbindlist(l)
# bind correctly by names
DT1 = data.table(A=1:3,B=letters[1:3])
DT2 = data.table(B=letters[4:5],A=4:5)
l = list(DT1,DT2)
rbindlist(l, use.names=TRUE)
# fill missing columns, and match by col names
DT1 = data.table(A=1:3,B=letters[1:3])
DT2 = data.table(B=letters[4:5],C=factor(1:2))
l = list(DT1,DT2)
rbindlist(l, use.names=TRUE, fill=TRUE)
# generate index column, auto generates indices
rbindlist(l, use.names=TRUE, fill=TRUE, idcol=TRUE)
# let's name the list
setattr(l, 'names', c("a", "b"))
rbindlist(l, use.names=TRUE, fill=TRUE, idcol="ID")
# }