summarise is used to summarise data on each node: reduce is then used to ensure that there's one overall summary.
reduce(.self, ..., auto_compact = NULL)
reduce_(.self, ..., .dots, auto_compact = NULL)
summary(...) will result in each node summarising the data it has available. This means that if there are 3 nodes in the cluster, then there will be 3 summary values. reduce is used to bring all those together to a single value.
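The two-step idea described above can be sketched in base R without a cluster. This is only an illustration, not how multiplyr is implemented: the 3 "nodes" are simulated by splitting a vector into 3 chunks, and the names node_id, chunks, per_node_N and total_N are hypothetical. How multiplyr actually assigns rows to nodes is not stated here, so the round-robin split is an assumption.

```r
# Base R sketch only: simulates 3 nodes by splitting the data into 3 chunks.
x <- 1:100

# Assumption: rows are dealt out to nodes in round-robin order.
node_id <- rep(1:3, length.out = length(x))
chunks <- split(x, node_id)

# Step 1 (analogous to summarise): each "node" summarises only its own data,
# giving one summary value per node.
per_node_N <- sapply(chunks, length)

# Step 2 (analogous to reduce): combine the per-node summaries into one value.
total_N <- sum(per_node_N)
```

Note that the combining step uses sum rather than length: the per-node values are already counts, so they are added together to get the overall count.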
mutate, nsa, summarise, transmute, within_group, within_node
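# Without reduce: each node summarises only the data it has available
dat <- Multiplyr(x = 1:100)
dat %>% summarise(N = length(x))
dat %>% shutdown()

# With reduce: the per-node counts are brought together into a single value
dat <- Multiplyr(x = 1:100)
dat %>% summarise(N = length(x)) %>% reduce(N = sum(N))
dat %>% shutdown()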