furrr (version 0.2.3)

furrr_options: Options to fine tune furrr

Description

These options fine-tune furrr functions, such as future_map(). They are either used by furrr directly or passed on to future::future().

Usage

furrr_options(
  ...,
  stdout = TRUE,
  conditions = NULL,
  globals = TRUE,
  packages = NULL,
  lazy = FALSE,
  seed = FALSE,
  scheduling = 1,
  chunk_size = NULL,
  prefix = NULL
)

Arguments

...

These dots are reserved for future extensibility and must be empty.

stdout

A logical.

  • If TRUE, standard output of the underlying futures is captured and relayed as soon as possible.

  • If FALSE, output is silenced by sinking it to the null device.

  • If NA, output is not intercepted. This is not recommended.
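A minimal sketch of the first two settings follows. It assumes a sequential plan, chosen only to keep the example lightweight, and a helper named print_and_return that is hypothetical and defined purely for illustration.

```r
library(furrr)

# Assumption: a sequential plan is used only to keep this sketch cheap to run.
future::plan(future::sequential)

# Hypothetical helper for illustration: prints its input, then returns it.
print_and_return <- function(x) {
  print(x)
  x
}

# stdout = TRUE (the default): printed output is captured and relayed.
relayed <- future_map(
  1:2, print_and_return,
  .options = furrr_options(stdout = TRUE)
)

# stdout = FALSE: printed output is silenced.
silenced <- future_map(
  1:2, print_and_return,
  .options = furrr_options(stdout = FALSE)
)

# The returned values do not depend on the stdout setting.
identical(relayed, silenced)
```

Only the handling of printed output differs between the two calls; the results themselves should be the same.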

conditions

A character vector of condition classes to be captured and relayed. The default is the same as the conditions argument of future::Future(). To not intercept conditions, use conditions = character(0L). Errors are always relayed.
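As a rough sketch of how this argument might be set, the example below builds one set of options that relays only messages and one that does not intercept conditions at all. The sequential plan and the helper noisy_identity are assumptions made for illustration, not part of furrr.

```r
library(furrr)

# Assumption: a sequential plan is used only to keep this sketch cheap to run.
future::plan(future::sequential)

# Hypothetical helper for illustration: signals a message, then returns its input.
noisy_identity <- function(x) {
  message("processing element ", x)
  x
}

# Capture and relay only conditions that inherit from the "message" class.
relay_messages <- furrr_options(conditions = "message")
res_relayed <- future_map(1:2, noisy_identity, .options = relay_messages)

# Do not intercept conditions.
no_intercept <- furrr_options(conditions = character(0L))
res_plain <- future_map(1:2, noisy_identity, .options = no_intercept)
```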

globals

A logical, a character vector, a named list, or NULL for controlling how globals are handled. For details, see the Global variables section below.

packages

A character vector, or NULL. If supplied, this specifies packages that are guaranteed to be attached in the R environment where the future is evaluated.

lazy

A logical. Specifies whether futures should be resolved lazily or eagerly.

seed

A logical, an integer of length 1 or 7, a list of length(.x) with pre-generated random seeds, or NULL. For details, see the Reproducible random number generation (RNG) section below.

scheduling

A single integer, logical, or Inf. This argument controls the average number of futures ("chunks") per worker.

  • If 0, then a single future is used to process all elements of .x.

  • If 1 or TRUE, then one future per worker is used.

  • If 2, then each worker will process two futures (provided there are enough elements in .x).

  • If Inf or FALSE, then one future per element of .x is used.

This argument is only used if chunk_size is NULL.

chunk_size

A single integer, Inf, or NULL. This argument controls the average number of elements per future ("chunk"). If Inf, then all elements are processed in a single future. If NULL, then scheduling is used instead to determine how .x is chunked.
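The following sketch contrasts scheduling and chunk_size on the same input. It assumes a multisession plan with two workers, which is an illustrative choice only; the options themselves are independent of the plan. The function double_it is a hypothetical helper.

```r
library(furrr)

# Assumption: two background R sessions; any other plan could be substituted.
future::plan(future::multisession, workers = 2)

x <- 1:8

# Hypothetical helper for illustration.
double_it <- function(value) value * 2L

# About two futures per worker.
two_per_worker <- future_map_int(
  x, double_it,
  .options = furrr_options(scheduling = 2L)
)

# One future per element of x.
one_per_element <- future_map_int(
  x, double_it,
  .options = furrr_options(scheduling = Inf)
)

# About four elements per future; scheduling is ignored because chunk_size is set.
four_per_future <- future_map_int(
  x, double_it,
  .options = furrr_options(chunk_size = 4L)
)

# Shut down the background workers again.
future::plan(future::sequential)
```

Chunking changes only how the work is distributed, so all three calls should return the same values. Fewer, larger chunks tend to reduce per-future overhead, while more, smaller chunks can balance load better when elements take uneven amounts of time.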

prefix

A single character string, or NULL. If a character string, then each future is assigned a label as {prefix}-{chunk-id}. If NULL, no labels are used.

Global variables

globals controls how globals are identified, similar to the globals argument of future::future(). Since all function calls use the same set of globals, furrr gathers globals upfront (once), which is more efficient than gathering them for each future independently.

  • If TRUE or NULL, then globals are automatically identified and gathered.

  • If a character vector of names is specified, then those globals are gathered.

  • If a named list, then those globals are used as is.

  • In all cases, .f and any ... arguments are automatically passed as globals to each future created, as they are always needed.
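The three forms listed above can be sketched as follows. The variable offset and the function add_offset are hypothetical names used for illustration, and the sequential plan is an assumption made to keep the example lightweight. Under a sequential plan the global would be reachable in any case, so this shows the syntax rather than the export mechanics.

```r
library(furrr)

# Assumption: a sequential plan is used only to keep this sketch cheap to run.
future::plan(future::sequential)

# Hypothetical global variable and function for illustration.
offset <- 10
add_offset <- function(x) x + offset

# Globals identified and gathered automatically.
auto <- future_map_dbl(
  1:3, add_offset,
  .options = furrr_options(globals = TRUE)
)

# Globals gathered by name.
by_name <- future_map_dbl(
  1:3, add_offset,
  .options = furrr_options(globals = "offset")
)

# Globals supplied as a named list and used as is.
as_list <- future_map_dbl(
  1:3, add_offset,
  .options = furrr_options(globals = list(offset = offset))
)
```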

Reproducible random number generation (RNG)

Unless seed = FALSE, furrr functions are guaranteed to generate the exact same sequence of random numbers given the same initial seed / RNG state regardless of the type of futures and scheduling ("chunking") strategy.

Setting seed = NULL is equivalent to seed = FALSE, except that the future.rng.onMisuse option is not consulted to potentially monitor the future for faulty random number usage. See the seed argument of future::future() for more details.

RNG reproducibility is achieved by pre-generating the random seeds for all iterations (over .x) by using L'Ecuyer-CMRG RNG streams. In each iteration, these seeds are set before calling .f(.x[[i]], ...). Note that for large length(.x), this may introduce considerable overhead.

A fixed seed may be given as an integer vector, either as a full L'Ecuyer-CMRG RNG seed of length 7, or as a seed of length 1 that will be used to generate a full L'Ecuyer-CMRG seed.

If seed = TRUE, then .Random.seed is used if it holds a L'Ecuyer-CMRG RNG seed, otherwise one is created randomly.

If seed = NA, a L'Ecuyer-CMRG RNG seed is randomly created.

If none of the function calls .f(.x[[i]], ...) use random number generation, then seed = FALSE may be used.

In addition to the above, it is possible to specify a pre-generated sequence of RNG seeds as a list such that length(seed) == length(.x) and where each element is an integer seed that can be assigned to .Random.seed. Use this alternative with caution. Note that as.list(seq_along(.x)) is not a valid set of such .Random.seed values.

In all cases but seed = FALSE, after a furrr function returns, the RNG state of the calling R process is guaranteed to be "forwarded one step" from the RNG state before the call. This is true regardless of the future strategy / scheduling used. This guarantees that an R script calling future_map() multiple times is numerically reproducible given the same initial seed.
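The reproducibility guarantee described in this section can be sketched as follows, using a fixed integer seed of length 1. The seed value 123L and the sequential plan are arbitrary choices for illustration.

```r
library(furrr)

# Assumption: a sequential plan is used only to keep this sketch cheap to run.
future::plan(future::sequential)

# A fixed seed of length 1, used to generate a full L'Ecuyer-CMRG seed.
opts <- furrr_options(seed = 123L)

draws_a <- future_map_dbl(1:4, function(i) rnorm(1), .options = opts)
draws_b <- future_map_dbl(1:4, function(i) rnorm(1), .options = opts)

# Same seed, different chunking: one element per future.
opts_chunked <- furrr_options(seed = 123L, chunk_size = 1L)
draws_chunked <- future_map_dbl(1:4, function(i) rnorm(1), .options = opts_chunked)

# Both comparisons are expected to hold given the guarantee stated above.
identical(draws_a, draws_b)
identical(draws_a, draws_chunked)
```

Because the per-element seeds are pre-generated from the initial seed, the draws should not depend on how .x is split into chunks.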

Examples

# NOT RUN {
furrr_options()
# }