data.table (version 1.12.0)

groupingsets: Grouping Set aggregation for data tables

Description

Calculate aggregates at various levels of groupings producing multiple (sub-)totals. Reflects SQLs GROUPING SETS operations.

Usage

rollup(x, …)
# S3 method for data.table
rollup(x, j, by, .SDcols, id = FALSE, …)
cube(x, …)
# S3 method for data.table
cube(x, j, by, .SDcols, id = FALSE, …)
groupingsets(x, …)
# S3 method for data.table
groupingsets(x, j, by, sets, .SDcols, id = FALSE, jj, …)

Arguments

x

data.table.

argument passed to custom user methods. Ignored for data.table methods.

j

expression passed to data.table j.

by

character column names by which we are grouping.

sets

list of character vector reflecting grouping sets, used in groupingsets for flexibility.

.SDcols

columns to be used in j expression in .SD object.

id

logical default FALSE. If TRUE it will add leading column with bit mask of grouping sets.

jj

quoted version of j argument, for convenience. When provided function will ignore j argument.

Value

A data.table with various aggregates.

Details

All three functions rollup, cube, groupingsets are generic methods, data.table methods are provided.

References

http://www.postgresql.org/docs/9.5/static/queries-table-expressions.html#QUERIES-GROUPING-SETS http://www.postgresql.org/docs/9.5/static/functions-aggregate.html#FUNCTIONS-GROUPING-TABLE

See Also

data.table, rbindlist

Examples

Run this code
# NOT RUN {
n = 24L
set.seed(25)
DT <- data.table(
    color = sample(c("green","yellow","red"), n, TRUE),
    year = as.Date(sample(paste0(2011:2015,"-01-01"), n, TRUE)),
    status = as.factor(sample(c("removed","active","inactive","archived"), n, TRUE)),
    amount = sample(1:5, n, TRUE),
    value = sample(c(3, 3.5, 2.5, 2), n, TRUE)
)

# rollup
rollup(DT, j = sum(value), by = c("color","year","status")) # default id=FALSE
rollup(DT, j = sum(value), by = c("color","year","status"), id=TRUE)
rollup(DT, j = lapply(.SD, sum), by = c("color","year","status"), id=TRUE, .SDcols="value")
rollup(DT, j = c(list(count=.N), lapply(.SD, sum)), by = c("color","year","status"), id=TRUE)

# cube
cube(DT, j = sum(value), by = c("color","year","status"), id=TRUE)
cube(DT, j = lapply(.SD, sum), by = c("color","year","status"), id=TRUE, .SDcols="value")
cube(DT, j = c(list(count=.N), lapply(.SD, sum)), by = c("color","year","status"), id=TRUE)

# groupingsets
groupingsets(DT, j = c(list(count=.N), lapply(.SD, sum)), by = c("color","year","status"),
             sets = list("color", c("year","status"), character()), id=TRUE)
# }

Run the code above in your browser using DataCamp Workspace