batchtools (version 0.9.12)

batchReduce: Reduce Operation for Batch Systems

Description

A parallel and asynchronous Reduce for batch systems. Note that this function only defines the computational jobs. Each job reduces the elements of one chunk (see argument chunks) on one slave. The actual computation is started with submitJobs. Results and partial results can be collected with reduceResultsList, reduceResults or loadResult.

Usage

batchReduce(
  fun,
  xs,
  init = NULL,
  chunks = seq_along(xs),
  more.args = list(),
  reg = getDefaultRegistry()
)

Arguments

fun

[function(aggr, x, ...)] Function to reduce xs with.

xs

[vector] Vector to reduce.

init

[ANY] Initial object for reducing. See Reduce.

chunks

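[integer(length(xs))] Group number for each element of xs; elements with the same group number are reduced in the same job. The default, seq_along(xs), puts every element into its own group. Can be generated with chunk.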

more.args

[list] A list of additional arguments passed to fun.

reg

[Registry] Registry. If not explicitly passed, uses the default registry (see setDefaultRegistry).
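
To make the roles of init, chunks and more.args more concrete, the following base-R sketch emulates locally what the jobs would compute. It assumes that each job folds the elements of its chunk with Reduce, starting from a supplied init and forwarding more.args to fun. It does not call batchtools, and the helper name emulate_batch_reduce is hypothetical.

```r
# Hypothetical helper (not part of batchtools): emulate the per-job work locally.
# Assumption: each job computes Reduce(fun, <elements of its chunk>, init).
emulate_batch_reduce = function(fun, xs, init, chunks = seq_along(xs), more.args = list()) {
  stopifnot(length(chunks) == length(xs))
  # Bind the extra arguments so the reducer has the plain (aggr, x) signature.
  reducer = function(aggr, x) do.call(fun, c(list(aggr, x), more.args))
  # One block of elements per distinct group number, i.e. one block per job.
  blocks = unname(split(xs, chunks))
  lapply(blocks, function(block) Reduce(reducer, block, init))
}

xs = 1:10
chunks = rep(1:2, each = 5)   # elements 1 to 5 form group 1, elements 6 to 10 form group 2
f = function(aggr, x, scale) aggr + scale * x

# One partial result per group; scale is passed through more.args.
partial = emulate_batch_reduce(f, xs, init = 0, chunks = chunks, more.args = list(scale = 2))

# Final reduction over the partial results.
total = Reduce(`+`, partial)
```

Under this assumption init enters every group's reduction once, so a neutral element (such as 0 for a sum) is the safe choice if the partial results are combined afterwards.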

Value

[data.table] with ids of added jobs stored in column “job.id”.

See Also

batchMap

Examples

# NOT RUN {
# create a temporary registry (file.dir = NA uses a temporary directory)
tmp = makeRegistry(file.dir = NA, make.default = FALSE)

# define function to reduce on slave, we want to sum a vector
xs = 1:100
f = function(aggr, x) aggr + x

# sum 20 numbers on each slave process, i.e. 5 jobs
chunks = chunk(xs, chunk.size = 20)
batchReduce(fun = f, xs = xs, init = 0, chunks = chunks, reg = tmp)
submitJobs(reg = tmp)
waitForJobs(reg = tmp)

# now reduce one final time on master
reduceResults(fun = function(aggr, job, res) f(aggr, res), reg = tmp)
# }
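
As a variation on the example above, the partial results can be inspected before the final reduction. This is a minimal sketch, assuming batchtools is installed and that the temporary registry runs its jobs in the current R session; the variable names ids, partial and total are illustrative only.

```r
library(batchtools)

# Temporary registry that is not set as the default registry.
tmp = makeRegistry(file.dir = NA, make.default = FALSE)
xs = 1:100
f = function(aggr, x) aggr + x

# Assign the 100 elements to 5 groups, one job per group.
chunks = chunk(xs, n.chunks = 5)
ids = batchReduce(fun = f, xs = xs, init = 0, chunks = chunks, reg = tmp)
submitJobs(ids, reg = tmp)
waitForJobs(ids, reg = tmp)

# Collect one partial sum per job as a list, then combine them locally.
partial = reduceResultsList(ids, reg = tmp)
total = Reduce(`+`, partial)
```

Collecting the partial results as a list first can help when checking individual jobs; reduceResults, as used in the example above, performs the final aggregation in a single call.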
