shard (version 0.1.0)

stream_group_sum: Stream group-wise sum

Description

Computes sum(value) for each group by streaming over partitions, without collecting the data first. It is optimized for factor group columns (declared with factor_col()).
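
The idea of summing by group across partitions can be sketched in base R. This is only an illustration of the general technique, not shard's actual implementation: the list of data frames standing in for partitions, and the names partitions, levels_g, and totals, are assumptions made for this example.

```r
# Hypothetical stand-in for on-disk partitions: a plain list of data.frames.
levels_g <- c("a", "b")
partitions <- list(
  data.frame(g = factor(c("a", "b", "a"), levels = levels_g), x = c(1, 2, 3)),
  data.frame(g = factor(c("b", "b"), levels = levels_g), x = c(10, NA))
)

# Running totals, one slot per factor level.
totals <- setNames(numeric(length(levels_g)), levels_g)

for (part in partitions) {
  keep <- !is.na(part$x)  # drop rows with NA values (like na_rm = TRUE)
  # Per-partition partial sums; levels absent from this partition sum to 0.
  partial <- vapply(split(part$x[keep], part$g[keep]), sum, numeric(1))
  totals <- totals + partial[levels_g]
}

# Assemble the result in the documented shape: group (factor), sum (numeric).
result <- data.frame(
  group = factor(names(totals), levels = levels_g),
  sum = unname(totals)
)
print(result)
```

Only one partition and a fixed-size vector of totals are held at a time, which is why a known set of factor levels makes this kind of accumulation cheap. In this sketch the totals come out as 4 for "a" and 12 for "b".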

Usage

stream_group_sum(x, group, value, na_rm = TRUE)

Value

A data.frame with columns group (factor) and sum (numeric).

Arguments

x

A shard_row_groups or shard_dataset handle.

group

Group column name (recommended: factor_col()).

value

Numeric column name to sum.

na_rm

Logical; if TRUE (the default), rows where value is NA are dropped before summing.

Examples

# \donttest{
s <- schema(g = factor_col(c("a", "b")), x = float64())
sink <- table_sink(s, mode = "row_groups")
table_write(sink, 1L,
  data.frame(g = factor(c("a", "b", "a"), levels = c("a", "b")), x = c(1, 2, 3)))
rg <- table_finalize(sink)
stream_group_sum(rg, "g", "x")
# }
