multiplyr (version 0.1.1)

reduce: Summarise data (with local reduction)

Description

summarise computes a summary separately on each node; reduce then combines those per-node results into a single overall summary.

Usage

reduce(.self, ..., auto_compact = NULL)
reduce_(.self, ..., .dots, auto_compact = NULL)

Arguments

.self
Data frame
...
Additional parameters
auto_compact
Compact data after operation
.dots
Workaround for non-standard evaluation

Value

Data frame

Details

When data have not been grouped, calling summarise(...) results in each node summarising only the data it holds. With 3 nodes in the cluster, for example, there will be 3 summary values. reduce combines those per-node values into a single value.
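The two-step pattern can be illustrated without a cluster. The following is a minimal sketch in base R only, not multiplyr's implementation: it assumes a hypothetical 3-node layout, simulated by splitting the data into three chunks, and the names node_id, chunks, partial_N and total_N are illustrative.

```r
# Simulate 3 "nodes" by assigning each value of x to one of 3 chunks.
# (Assumption: round-robin assignment; a real cluster may distribute differently.)
x <- 1:100
node_id <- rep_len(1:3, length(x))
chunks <- split(x, node_id)

# Step 1 (analogous to summarise on ungrouped data):
# each node summarises only the data it holds, giving one value per node.
partial_N <- vapply(chunks, length, integer(1))

# Step 2 (analogous to reduce):
# combine the per-node summaries into a single overall value.
total_N <- sum(partial_N)

print(partial_N)
print(total_N)
```

Note that the combining expression has to suit the summary: counts and sums can be added together, but per-node means generally cannot simply be averaged when nodes hold different numbers of rows. In that case it is safer to summarise a sum and a count on each node and divide after reducing.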

See Also

Other data manipulations: mutate, nsa, summarise, transmute, within_group, within_node

Examples

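library(multiplyr)

# Without reduce: each node returns its own count
dat <- Multiplyr(x = 1:100)
dat %>% summarise(N = length(x))
dat %>% shutdown()

# With reduce: the per-node counts are summed into one overall count
dat <- Multiplyr(x = 1:100)
dat %>% summarise(N = length(x)) %>% reduce(N = sum(N))
dat %>% shutdown()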
