parallelism_choices
List the types of supported parallel computing.
parallelism_choices(distributed_only = FALSE)
distributed_only: logical, whether to return only the distributed backend types, such as Makefile and parLapply.
Character vector listing the types of parallel computing supported.
Run make(..., parallelism = x, jobs = n) for any of the following values of x to distribute targets over parallel units of execution.
'parLapply': launches multiple processes in a single R session using parallel::parLapply(). This is single-node, (potentially) multicore computing. It requires more overhead than the 'mclapply' option, but it works on Windows. If jobs is 1 in make(), then no 'cluster' is created and no parallelism is used.
'mclapply': uses multiple processes in a single R session. This is single-node, (potentially) multicore computing. Does not work on Windows for jobs > 1 because mclapply() is based on forking.
'future_lapply': opens up a whole trove of parallel backends powered by the future and future.batchtools packages. First, set the parallel backend globally using backend() (or equivalently, future::plan()). Then, apply the backend to your workplan using make(..., parallelism = "future_lapply", jobs = ...). But be warned: the environment for each target needs to be set up from scratch, so this backend type has higher overhead than either mclapply or parLapply. Also, the jobs argument only applies to the imports. To set the maximum number of jobs for building targets, use options(mc.cores = jobs), or see ?future::future.options for environment variables that set the maximum number of jobs.
'Makefile': uses multiple R sessions by creating and running a Makefile. For distributed computing on a cluster or supercomputer, try make(..., parallelism = 'Makefile', prepend = 'SHELL=./shell.sh'). You need an auxiliary shell.sh file for this, and shell_file() writes an example.
Here, Makefile-level parallelism is only used for targets in your workflow plan data frame, not imports. To process imported objects and files, drake selects the best parallel backend for your system and uses the number of jobs you give to the jobs argument to make(). To use at most 2 jobs for imports and at most 4 jobs for targets, run
make(..., parallelism = 'Makefile', jobs = 2, args = '--jobs=4')
Caution: the Makefile generated by make(..., parallelism = 'Makefile') is NOT standalone. DO NOT run it outside of drake's make(). Also, Windows users will need to download and install Rtools.
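To illustrate the difference between the 'parLapply' and 'mclapply' backends described above, here is a minimal sketch that calls the underlying functions from base R's parallel package directly. It does not use drake and is not drake's implementation. The names square and inputs are hypothetical stand-ins for real targets, and the worker count of 2 is an arbitrary choice.

```r
library(parallel)

# Hypothetical stand-ins for real work.
square <- function(x) x^2
inputs <- 1:4

# parLapply-style: start a cluster of worker processes, send them the work,
# then shut the cluster down. This also works on Windows.
cl <- makeCluster(2)
res_cluster <- parLapply(cl, inputs, square)
stopCluster(cl)

# mclapply-style: fork the current R session. Forking is not available on
# Windows, so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
res_fork <- mclapply(inputs, square, mc.cores = cores)
```

Both calls return a list of results in the same order as inputs. The cluster version pays the cost of starting fresh worker processes, which is the extra overhead mentioned for 'parLapply'.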
# NOT RUN {
parallelism_choices()
parallelism_choices(distributed_only = TRUE)
# }