parallelism_choices: Function `parallelism_choices`

Description

List the types of supported parallel computing.

Usage

parallelism_choices(distributed_only = FALSE)

Arguments

distributed_only

logical, whether to return only the distributed backend types, such as 'Makefile' and 'future_lapply'.

Value

Character vector listing the types of parallel computing supported.
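Because the return value is a plain character vector, it can be combined with ordinary set operations. The sketch below assumes drake is installed and attached. It separates the distributed backend types from the rest; the variable names are illustrative only, and the exact contents depend on the installed version of drake.

```r
library(drake)

# All backend types supported by the installed version of drake.
all_types <- parallelism_choices()

# Only the distributed backend types.
distributed <- parallelism_choices(distributed_only = TRUE)

# Whatever remains is treated here as the single-node backends.
local_only <- setdiff(all_types, distributed)
print(local_only)
```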

Details

Run make(..., parallelism = x, jobs = n) for any of the following values of x to distribute targets over parallel units of execution.

'parLapply'

launches multiple processes in a single R session using parallel::parLapply(). This is single-node, (potentially) multicore computing. It requires more overhead than the 'mclapply' option, but it works on Windows. If jobs is 1 in make(), then no 'cluster' is created and no parallelism is used.

'mclapply'

uses multiple processes in a single R session. This is single-node, (potentially) multicore computing. It does not work on Windows for jobs > 1 because mclapply() is based on forking.

'future_lapply'

opens up a whole trove of parallel backends powered by the future and future.batchtools packages. First, set the parallel backend globally using backend() (or equivalently, future::plan()). Then, apply the backend to your workplan using make(..., parallelism = "future_lapply", jobs = ...). But be warned: the environment for each target needs to be set up from scratch, so this backend type has higher overhead than either mclapply or parLapply. Also, the jobs argument only applies to the imports. To set the maximum number of jobs for building targets, use options(mc.cores = jobs), or see ?future::future.options for environment variables that set the maximum number of jobs.

'Makefile'

uses multiple R sessions by creating and running a Makefile. For distributed computing on a cluster or supercomputer, try make(..., parallelism = 'Makefile', prepend = 'SHELL=./shell.sh'). You need an auxiliary shell.sh file for this, and shell_file() writes an example.

Here, Makefile-level parallelism is only used for targets in your workflow plan data frame, not imports. To process imported objects and files, drake selects the best parallel backend for your system and uses the number of jobs you give to the jobs argument to make(). To use at most 2 jobs for imports and at most 4 jobs for targets, run make(..., parallelism = 'Makefile', jobs = 2, args = '--jobs=4').

Caution: the Makefile generated by make(..., parallelism = 'Makefile') is NOT standalone. DO NOT run it outside of make(). Also, Windows users will need to download and install Rtools.
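For the two single-node backends, the notes above suggest choosing by platform: 'mclapply' has less overhead, but 'parLapply' also works on Windows. A minimal base R sketch of that choice follows. pick_local_backend() is a hypothetical helper, not a drake function, and my_plan in the commented call stands in for your own workflow plan.

```r
# Hypothetical helper (not part of drake): choose a single-node backend.
# mclapply() relies on forking, which is unavailable on Windows,
# so fall back to parLapply there.
pick_local_backend <- function(os_type = .Platform$OS.type) {
  if (identical(os_type, "windows")) "parLapply" else "mclapply"
}

backend_type <- pick_local_backend()
print(backend_type)

# The result can then be passed to make(), for example:
# make(my_plan, parallelism = backend_type, jobs = 2)
```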

Examples

# NOT RUN {
parallelism_choices()
parallelism_choices(distributed_only = TRUE)
# }
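The 'future_lapply' steps described in Details can be sketched as follows. This is an illustration under several assumptions, not a definitive recipe. It assumes that the installed drake version still accepts parallelism = "future_lapply" and that the future package is installed. The multisession backend was chosen here only as an example. The toy plan is a hand-built data frame with target and command columns standing in for a real workflow plan. Running it creates a drake cache in the working directory.

```r
library(drake)

# Assumed toy workflow plan: one row per target, with the command as text.
my_plan <- data.frame(
  target = c("a", "b"),
  command = c("1 + 1", "a + 1"),
  stringsAsFactors = FALSE
)

# Step 1: set the parallel backend globally.
# multisession is an illustrative choice; any future backend could be used.
future::plan(future::multisession)

# Step 2: cap the number of jobs used for building targets.
options(mc.cores = 2)

# Step 3: apply the backend to the plan.
# Here the jobs argument only applies to the imports.
make(my_plan, parallelism = "future_lapply", jobs = 2)

# Read a built target back from the cache.
readd(b)
```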