Last chance! 50% off unlimited learning
Sale ends in
An S3 method for melting data.table
s written in C for speed and memory
efficiency. Since v1.9.6
, melt.data.table
allows melting into
multiple columns simultaneously.
It is not necessary to load reshape2
anymore. But if you have to, then
load reshape2
package before loading data.table
.
## fast melt a data.table
# S3 method for data.table
melt(data, id.vars, measure.vars,
variable.name = "variable", value.name = "value",
..., na.rm = FALSE, variable.factor = TRUE,
value.factor = FALSE,
verbose = getOption("datatable.verbose"))
A data.table
object to melt.
vector of id variables. Can be integer (corresponding id column numbers) or character (id column names) vector. If missing, all non-measure columns will be assigned to it.
vector of measure variables. Can be integer (corresponding measure column numbers) or character (measure column names) vector. If missing, all non-id columns will be assigned to it.
measure.vars
also now accepts a list of character/integer vectors to
melt into multiple columns - i.e., melt into more than one value
columns
simultaneously. Use patterns
to provide multiple patterns
conveniently. See also Examples
.
name for the measured variable names column. The default name is 'variable'.
name for the molten data values column. The default name is 'value'.
If TRUE
, NA
values will be removed from the molten
data.
If TRUE
, the variable
column will be
converted to factor
, else it will be a character
column.
If TRUE
, the value
column will be converted
to factor
, else the molten value type is left unchanged.
TRUE
turns on status and information messages to the
console. Turn this on by default using options(datatable.verbose=TRUE)
.
The quantity and types of verbosity may be expanded in future.
any other arguments to be passed to/from other methods.
An unkeyed data.table
containing the molten data.
If id.vars
and measure.vars
are both missing, all
non-numeric/integer/logical
columns are assigned as id variables and
the rest as measure variables. If only one of id.vars
or
measure.vars
is supplied, the rest of the columns will be assigned to
the other. Both id.vars
and measure.vars
can have the same column
more than once and the same column can be both as id and measure variables.
melt.data.table
also accepts list
columns for both id and measure
variables.
When all measure.vars
are not of the same type, they'll be coerced
according to the hierarchy list
> character
> numeric >
integer > logical
. For example, if any of the measure variables is a
list
, then entire value column will be coerced to a list. Note that,
if the type of value
column is a list, na.rm = TRUE
will have no
effect.
From version 1.9.6
, melt
gains a feature with measure.vars
accepting a list of character
or integer
vectors as well to melt
into multiple columns in a single function call efficiently. The function
patterns
can be used to provide regular expression patterns. When
used along with melt
, if cols
argument is not provided, the
patterns will be matched against names(data)
, for convenience.
Attributes are preserved if all value
columns are of the same type. By
default, if any of the columns to be melted are of type factor
, it'll
be coerced to character
type. This is to be compatible with
reshape2
's melt.data.frame
. To get a factor
column, set
value.factor = TRUE
. melt.data.table
also preserves
ordered
factors.
# NOT RUN {
set.seed(45)
require(data.table)
DT <- data.table(
i_1 = c(1:5, NA),
i_2 = c(NA,6,7,8,9,10),
f_1 = factor(sample(c(letters[1:3], NA), 6, TRUE)),
f_2 = factor(c("z", "a", "x", "c", "x", "x"), ordered=TRUE),
c_1 = sample(c(letters[1:3], NA), 6, TRUE),
d_1 = as.Date(c(1:3,NA,4:5), origin="2013-09-01"),
d_2 = as.Date(6:1, origin="2012-01-01"))
# add a couple of list cols
DT[, l_1 := DT[, list(c=list(rep(i_1, sample(5,1)))), by = i_1]$c]
DT[, l_2 := DT[, list(c=list(rep(c_1, sample(5,1)))), by = i_1]$c]
# id, measure as character/integer/numeric vectors
melt(DT, id=1:2, measure="f_1")
melt(DT, id=c("i_1", "i_2"), measure=3) # same as above
melt(DT, id=1:2, measure=3L, value.factor=TRUE) # same, but 'value' is factor
melt(DT, id=1:2, measure=3:4, value.factor=TRUE) # 'value' is *ordered* factor
# preserves attribute when types are identical, ex: Date
melt(DT, id=3:4, measure=c("d_1", "d_2"))
melt(DT, id=3:4, measure=c("i_1", "d_1")) # attribute not preserved
# on list
melt(DT, id=1, measure=c("l_1", "l_2")) # value is a list
melt(DT, id=1, measure=c("c_1", "l_1")) # c1 coerced to list
# on character
melt(DT, id=1, measure=c("c_1", "f_1")) # value is char
melt(DT, id=1, measure=c("c_1", "i_2")) # i2 coerced to char
# on na.rm=TRUE. NAs are removed efficiently, from within C
melt(DT, id=1, measure=c("c_1", "i_2"), na.rm=TRUE) # remove NA
# measure.vars can be also a list
# melt "f_1,f_2" and "d_1,d_2" simultaneously, retain 'factor' attribute
# convenient way using internal function patterns()
melt(DT, id=1:2, measure=patterns("^f_", "^d_"), value.factor=TRUE)
# same as above, but provide list of columns directly by column names or indices
melt(DT, id=1:2, measure=list(3:4, c("d_1", "d_2")), value.factor=TRUE)
# na.rm=TRUE removes rows with NAs in any 'value' columns
melt(DT, id=1:2, measure=patterns("f_", "d_"), value.factor=TRUE, na.rm=TRUE)
# return 'NA' for missing columns, 'na.rm=TRUE' ignored due to list column
melt(DT, id=1:2, measure=patterns("l_", "c_"), na.rm=TRUE)
# }
Run the code above in your browser using DataLab