future (version 1.16.0)

cluster: Create a cluster future whose value will be resolved asynchronously in a parallel process

Description

A cluster future is a future that uses cluster evaluation, which means that its value is computed and resolved in parallel in another process.

Usage

cluster(
  expr,
  envir = parent.frame(),
  substitute = TRUE,
  lazy = FALSE,
  seed = NULL,
  globals = TRUE,
  persistent = FALSE,
  workers = availableWorkers(),
  user = NULL,
  revtunnel = TRUE,
  homogeneous = TRUE,
  gc = FALSE,
  earlySignal = FALSE,
  label = NULL,
  ...
)

Arguments

expr

An R expression.

envir

The environment from where global objects should be identified.

substitute

If TRUE, argument expr is substitute():ed, otherwise not.

lazy

If FALSE (default), the future is resolved eagerly (starting immediately), otherwise not.

seed

(optional) If TRUE, the random seed, that is, the state of the random number generator (RNG), will be set such that statistically sound random numbers are produced (also during parallelization). If FALSE, it is assumed that the future expression neither needs nor uses random number generation. To use a fixed random seed, specify a L'Ecuyer-CMRG seed (seven integers) or a regular RNG seed (a single integer). Furthermore, if FALSE, the future is monitored to make sure it does not use random numbers. If it does, then depending on the value of option future.rng.misUse, the misuse is ignored, or an informative warning or error is produced. If seed is NULL (default), the effect is the same as with seed = FALSE but without the RNG check being performed.
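
As an illustration of the seed argument, here is a minimal sketch. It assumes two local workers are acceptable and that the seed is passed via future() after plan(cluster) has been set. The variable names are arbitrary.

```r
library(future)

## Two local worker processes
plan(cluster, workers = 2L)

## A fixed seed (a single integer): both futures draw the same numbers,
## regardless of which worker evaluates them
f1 <- future(rnorm(3), seed = 42L)
f2 <- future(rnorm(3), seed = 42L)
v1 <- value(f1)
v2 <- value(f2)
print(identical(v1, v2))

## seed = TRUE: statistically sound random numbers, not fixed across runs
f3 <- future(rnorm(3), seed = TRUE)
v3 <- value(f3)

## Return to sequential processing
plan(sequential)
```

With the default seed = NULL, the same expression would still run, but no parallel-safe RNG state would be set up for it.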

globals

(optional) a logical, a character vector, or a named list to control how globals are handled. For details, see section 'Globals used by future expressions' in the help for future().
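
The three forms of the globals argument can be sketched as follows. This is an illustrative example, not taken from the package documentation. It uses a single local worker and a global variable named a chosen for the example.

```r
library(future)

## One local worker process is enough for this illustration
plan(cluster, workers = 1L)

a <- 1

## Logical (default TRUE): globals are identified automatically
f_auto <- future(a + 1)
v_auto <- value(f_auto)

## Character vector: names of the globals to export
f_chr <- future(a + 1, globals = "a")
v_chr <- value(f_chr)

## Named list: the global 'a' is supplied explicitly with its own value
f_list <- future(a + 1, globals = list(a = 10))
v_list <- value(f_list)

print(c(v_auto, v_chr, v_list))

plan(sequential)
```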

persistent

If FALSE, the evaluation environment is cleared of objects prior to the evaluation of the future.

workers

A cluster object, a character vector of host names, a positive numeric scalar, or a function. If a character vector or a numeric scalar, a cluster object is created using makeClusterPSOCK(workers). If a function, it is called without arguments when the future is created and its value is used to configure the workers. The function should return any of the above types.
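
The accepted types for workers might be used as sketched below. The example assumes that local worker processes are sufficient. A character vector of host names would be passed the same way, but is only mentioned in a comment because it depends on the machines available.

```r
library(future)

main_pid <- Sys.getpid()

## (a) Numeric scalar: two local workers, created by the plan itself
plan(cluster, workers = 2L)
pid_num <- value(future(Sys.getpid()))

## (b) Existing cluster object: created, and later stopped, by the caller
cl <- parallel::makeCluster(2L)
plan(cluster, workers = cl)
pid_cl <- value(future(Sys.getpid()))

## (c) Function: returns any of the other types
plan(cluster, workers = function() 2L)
pid_fun <- value(future(Sys.getpid()))

## (d) Character vector of host names (not run here), for example
##     workers = c("localhost", "localhost")

## Each future was evaluated in a process other than the main R session
print(c(main = main_pid, numeric = pid_num, cluster = pid_cl, fun = pid_fun))

## CLEANUP
plan(sequential)
parallel::stopCluster(cl)
```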

user

(optional) The user name to be used when communicating with another host.

revtunnel

If TRUE, reverse SSH tunneling is used for the PSOCK cluster nodes to connect back to the master R process. This avoids the hassle of firewalls, port forwarding and having to know the internal / public IP address of the master R session.

homogeneous

If TRUE, all cluster nodes are assumed to use the same path to Rscript as the main R session. If FALSE, Rscript is assumed to be on the PATH of each node.

gc

If TRUE, the garbage collector is run (in the process that evaluated the future) only after the value of the future is collected. Exactly when the values are collected may depend on various factors such as the number of free workers and whether earlySignal is TRUE (more frequently) or FALSE (less frequently). Some types of futures ignore this argument.

earlySignal

Specifies whether conditions should be signaled as soon as possible or not.

label

An optional character string label attached to the future.

...

Additional named elements passed to ClusterFuture().

Value

A ClusterFuture.

Details

This function will block if all available R cluster nodes are occupied and will be unblocked as soon as one of the already running cluster futures is resolved.

The preferred way to create a cluster future is not to call this function directly, but to register it via plan(cluster) such that it becomes the default mechanism for all futures. After this, future() and %<-% will create cluster futures.
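
A minimal sketch of this workflow, assuming two local workers and arbitrary variable names, shows both the explicit form, future(), and the implicit form, %<-%:

```r
library(future)

## Register cluster futures as the default mechanism
plan(cluster, workers = 2L)

main_pid <- Sys.getpid()

## Explicit future: create it, then collect its value
f <- future(Sys.getpid())
pid_explicit <- value(f)

## Implicit future: a future assignment, resolved when the variable is used
pid_implicit %<-% Sys.getpid()

## Both process IDs differ from that of the main R session
print(c(main = main_pid, explicit = pid_explicit, implicit = pid_implicit))

plan(sequential)
```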

Examples

# NOT RUN {
library(future)

## Use cluster futures
cl <- parallel::makeCluster(2L, timeout = 60)
plan(cluster, workers = cl)

## A global variable
a <- 0

## Create future (explicitly)
f <- future({
  b <- 3
  c <- 2
  a * b * c
})

## A cluster future is evaluated in a separate process.
## Regardless, changing the value of a global variable will
## not affect the result of the future.
a <- 7
print(a)

v <- value(f)
print(v)
stopifnot(v == 0)

## CLEANUP
parallel::stopCluster(cl)

# }
