make_async_evaluator: Create Asynchronous Evaluator to Queue Tasks

Description

The asynchronous evaluator queues R expressions and evaluates them in sub-processes without blocking the main session. It is built on the 'parallel' and 'future' packages.

Usage

make_async_evaluator(name, path = tempfile(), n_nodes = 1,
  n_subnodes = future::availableCores() - 1, verbose = FALSE, ...)

Arguments

name

unique name for the evaluator

path

empty directory in which the evaluator stores its data

n_nodes

number of control nodes, default is 1

n_subnodes

number of sub-sessions for each control node, default is the number of CPU cores minus 1

verbose

for internal debug use

...

passed to the constructor of MasterEvaluator

Value

A list of functions to control the evaluator:

run(expr, success = NULL, failure = NULL, priority = 0, persist = FALSE, quoted = FALSE, ..., .list = NULL)

Queue and run an R expression.

expr

can be any R expression except q(), which terminates the session. 'rlang' quasiquotation is also supported: for example, use `!!` to unquote values from the main session into the expression.

..., .list

provide additional data for expr. For example, expr may use a large data object dat from the main session that is not available in the child sessions. Because the object is large, quasiquotation could be slow or fail. Passing dat=... or .list=list(dat=...) temporarily stores the data on the hard drive so that it persists for the evaluators. The back-end uses qs_map, which is very fast for data no larger than 2GB.

success and failure

functions that handle the result once the evaluator returns a value. Because it is almost impossible to know when the evaluator will return, these functions should be kept simple.

priority

sets the priority of the expression. It can only be `0` or `1`. Evaluators run expressions with priority 1 first.

persist

indicates whether intermediate variables persist after the expression is run.
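The arguments of run() described above can be combined as in the following sketch. It assumes that make_async_evaluator is available from the 'dipsaus' package and only launches child sessions in an interactive session. The evaluator name "demo_evaluator", the variables dat and scale_factor, and the choice of print as the success callback are illustrative, and the callback is assumed to receive the evaluated value.

```r
# Illustrative data living in the main session
dat <- seq_len(100)
scale_factor <- 10

# Value the queued expression is expected to produce
expected <- mean(dat) * scale_factor

# Assumption: make_async_evaluator() comes from the 'dipsaus' package.
# Child sessions are only launched when running interactively.
if (interactive() && requireNamespace("dipsaus", quietly = TRUE)) {
  evaluator <- dipsaus::make_async_evaluator(
    name = "demo_evaluator", n_nodes = 1, n_subnodes = 1
  )

  evaluator$run(
    {
      # `!!scale_factor` is unquoted to its value before queuing;
      # `dat` reaches the child session through the `dat = dat` argument
      mean(dat) * !!scale_factor
    },
    success = print,  # assumed to be called with the evaluated value
    priority = 1,     # run ahead of priority-0 expressions
    dat = dat
  )

  # run() does not wait for the result, so the main session stays free.
  # Call evaluator$terminate() once the results have arrived.
}
```

Small values such as scale_factor are convenient to unquote with `!!`, while larger objects such as dat are better passed through `...` or `.list`, as noted above.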

terminate()

Shut down the evaluator and release all resources.

scale_down(n_nodes, n_subnodes = 1)

scale_up(n_nodes, n_subnodes = 1,
      create_if_missing = FALSE, path = tempfile())

Scale down or up the evaluator.

n_nodes and n_subnodes: see 'Usage'
create_if_missing: If the evaluator was previously terminated or shut down, setting this to TRUE ignores the `invalid` flags and re-initializes the evaluator
path: If create_if_missing is TRUE, then path is passed to the constructor of MasterEvaluator. See 'Usage'.

workers(...)

Returns the number of workers available in the evaluator. `...` is for debugging use.

progress()

Returns a vector of 4 integers. They are:

The total number of evaluations.
Number of running evaluations.
Number of awaiting evaluations.
Number of finished evaluations.
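
Because the four counts are returned by position, a small helper can make them easier to read. The sketch below uses a mock vector in place of a real progress() result. The helper name label_progress and the element names are hypothetical and not part of the evaluator itself.

```r
# Hypothetical helper: attach names to the four counts, in the order listed above
label_progress <- function(p) {
  stopifnot(length(p) == 4)
  stats::setNames(
    as.integer(p),
    c("total", "running", "awaiting", "finished")
  )
}

# Mock value standing in for evaluator$progress()
mock <- c(10L, 2L, 3L, 5L)
status <- label_progress(mock)

# Assumption: all queued work is done once the finished count reaches the total
all_done <- status[["finished"]] == status[["total"]]
```

With a live evaluator, the same helper could be applied to the result of progress() each time the main session wants to check on the queue.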

Details

'parallel' blocks the main session when evaluating expressions. 'future' blocks the main session when the number of running futures exceeds the maximum number of workers. For example, if 4 workers are planned, then running 5 future instances at the same time will freeze the session.

The asynchronous evaluator is designed to queue any number of R expressions without blocking the main session. Incoming expressions are stored in AbstractQueue instances; the main session monitors the queue and is in charge of notifying child sessions to evaluate these expressions whenever they are available.
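
The queue-then-dispatch idea can be illustrated with a toy example in base R. This is only a conceptual sketch and not the AbstractQueue implementation: the names new_queue, push, pop and size are hypothetical, and everything runs in a single session. It also mimics the rule that expressions with priority 1 are taken first.

```r
# Toy queue: a list of entries, each holding a quoted expression and a priority
new_queue <- function() {
  entries <- list()

  push <- function(expr, priority = 0) {
    stopifnot(priority %in% c(0, 1))
    entries[[length(entries) + 1]] <<- list(expr = expr, priority = priority)
    invisible(length(entries))
  }

  pop <- function() {
    if (length(entries) == 0) return(NULL)
    priorities <- vapply(entries, function(e) e$priority, numeric(1))
    # which.max() returns the first maximum, so ties stay first-in, first-out
    idx <- which.max(priorities)
    item <- entries[[idx]]
    entries[[idx]] <<- NULL
    item$expr
  }

  size <- function() length(entries)

  list(push = push, pop = pop, size = size)
}

queue <- new_queue()
queue$push(quote(1 + 1))
queue$push(quote(2 * 3), priority = 1)
queue$push(quote(10 - 3))

# Stand-in for the dispatcher: take expressions off the queue and evaluate them
results <- list()
while (queue$size() > 0) {
  results[[length(results) + 1]] <- eval(queue$pop())
}
results <- unlist(results)  # 6 (priority 1), then 2 and 7 in arrival order
```

In the real evaluator the evaluation happens in child sessions rather than in the loop shown here, which is what keeps the main session responsive.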

Important: The asynchronous evaluator is not designed for very high-performance computing. The internal scheduler schedules n_nodes evaluations every second. Therefore, if each evaluation can finish within 1 / n_nodes seconds, use `future` instead.
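
The rule of thumb above is simple arithmetic, sketched below in base R. The function names prefer_future and min_dispatch_seconds are hypothetical helpers written for this illustration. The second one assumes that the scheduling rate of n_nodes evaluations per second is the only limit.

```r
# TRUE when a single evaluation finishes within 1 / n_nodes seconds,
# the case in which the text recommends using 'future' directly
prefer_future <- function(task_seconds, n_nodes) {
  stopifnot(n_nodes >= 1, task_seconds >= 0)
  task_seconds <= 1 / n_nodes
}

# Lower bound on the time needed just to schedule n_tasks evaluations
min_dispatch_seconds <- function(n_tasks, n_nodes) {
  stopifnot(n_nodes >= 1, n_tasks >= 0)
  n_tasks / n_nodes
}

quick_task <- prefer_future(0.1, n_nodes = 4)          # 0.1 <= 0.25
slow_task <- prefer_future(2, n_nodes = 4)             # 2 > 0.25
dispatch_time <- min_dispatch_seconds(1000, n_nodes = 4)  # 1000 / 4 seconds
```

With 4 control nodes, for example, a thousand evaluations need at least 250 seconds to be scheduled, however short each one is.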