clustermq workers.
This function is like tar_make() except that targets
run in parallel with persistent clustermq workers. It requires
that you set global options like clustermq.scheduler and
clustermq.template inside the target script file
(default: _targets.R).
clustermq is not a strict dependency of targets,
so you must install clustermq yourself.
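Because clustermq is optional, a script may want to confirm it is available before calling tar_make_clustermq(). A minimal sketch in base R, where has_clustermq is a hypothetical variable name:

```r
# Check whether the optional clustermq package can be loaded.
has_clustermq <- requireNamespace("clustermq", quietly = TRUE)
if (!has_clustermq) {
  message("clustermq is not installed; try install.packages(\"clustermq\").")
}
```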
tar_make_clustermq(
names = NULL,
shortcut = targets::tar_config_get("shortcut"),
reporter = targets::tar_config_get("reporter_make"),
workers = targets::tar_config_get("workers"),
log_worker = FALSE,
callr_function = callr::r,
callr_arguments = targets::callr_args_default(callr_function, reporter),
envir = parent.frame(),
script = targets::tar_config_get("script"),
store = targets::tar_config_get("store")
)
names: Names of the targets to build or check. Set to NULL to
check/build all the targets (default). Otherwise, you can supply
symbols, a character vector, or tidyselect helpers like
all_of() and starts_with().
Applies to ordinary targets (stems) and whole dynamic branching targets
(patterns) but not individual dynamic branches.
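The selection styles described above might be used as in the following sketch. The target names data_raw, data_clean, and report are hypothetical, and the guard follows the TAR_EXAMPLES convention so nothing runs unless that environment variable is set.

```r
# Only run when explicitly enabled and not on Windows
# (the multicore scheduler is unavailable there).
run_examples <- identical(Sys.getenv("TAR_EXAMPLES"), "true") &&
  !identical(tolower(Sys.info()[["sysname"]]), "windows")
if (run_examples) {
  library(targets)
  tar_dir({ # Work in a temporary directory.
    tar_script({
      options(clustermq.scheduler = "multicore")
      list(
        tar_target(data_raw, 1 + 1),
        tar_target(data_clean, data_raw * 2),
        tar_target(report, data_clean + 1)
      )
    }, ask = FALSE)
    # tidyselect helper: every target whose name starts with "data".
    tar_make_clustermq(names = starts_with("data"))
    # Symbol: one target, plus a check of everything upstream of it.
    tar_make_clustermq(names = report)
  })
}
```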
shortcut: Logical of length 1, how to interpret the names argument.
If shortcut is FALSE (default) then the function checks
all targets upstream of names as far back as the dependency graph goes.
shortcut = TRUE increases speed if there are a lot of
up-to-date targets, but it assumes all the dependencies
are up to date, so please use with caution.
It relies on stored metadata for information about upstream dependencies.
shortcut = TRUE only works if you set names.
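A typical pattern, sketched under the assumption that a full run has already recorded metadata, is to build everything once and then use the shortcut for a single named target. The target names here are hypothetical.

```r
# Only run when explicitly enabled and not on Windows.
run_examples <- identical(Sys.getenv("TAR_EXAMPLES"), "true") &&
  !identical(tolower(Sys.info()[["sysname"]]), "windows")
if (run_examples) {
  library(targets)
  tar_dir({
    tar_script({
      options(clustermq.scheduler = "multicore")
      list(
        tar_target(data_raw, 1 + 1),
        tar_target(report, data_raw + 1)
      )
    }, ask = FALSE)
    # Full run: checks every target and stores metadata.
    tar_make_clustermq()
    # Shortcut run: checks only report and trusts that data_raw is up to date.
    tar_make_clustermq(names = report, shortcut = TRUE)
  })
}
```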
reporter: Character of length 1, name of the reporter to use.
Controls how messages are printed as targets run in the pipeline.
Defaults to tar_config_get("reporter_make"). Choices:
"verbose": print one message for each target that runs (default).
"silent": print nothing.
"timestamp": print a time-stamped message for each target that runs.
"summary": print a running total of the number of targets in
each status category (queued, started, skipped, built, canceled,
or errored). Also show a timestamp ("%H:%M %OS2" strptime() format)
of the last time the progress changed and printed to the screen.
workers: Positive integer, number of persistent clustermq workers
to create.
log_worker: Logical, whether to write a log file for each worker.
Same as the log_worker argument of clustermq::Q()
and clustermq::workers().
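As a rough illustration of these two arguments, the sketch below requests two persistent workers and turns on worker logging. The worker count and the two independent targets are arbitrary choices for the example.

```r
# Only run when explicitly enabled and not on Windows.
run_examples <- identical(Sys.getenv("TAR_EXAMPLES"), "true") &&
  !identical(tolower(Sys.info()[["sysname"]]), "windows")
if (run_examples) {
  library(targets)
  tar_dir({
    tar_script({
      options(clustermq.scheduler = "multicore")
      list(
        tar_target(a, Sys.getpid()),
        tar_target(b, Sys.getpid())
      )
    }, ask = FALSE)
    # Two persistent workers, each writing its own log file.
    tar_make_clustermq(workers = 2, log_worker = TRUE)
  })
}
```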
callr_function: A function from callr to start a fresh clean R
process to do the work. Set to NULL to run in the current session
instead of an external process (but restart your R session just before
you do in order to clear debris out of the global environment).
callr_function needs to be NULL for interactive debugging,
e.g. tar_option_set(debug = "your_target").
However, callr_function should not be NULL for serious
reproducible work.
callr_arguments: A list of arguments to callr_function.
envir: An environment, where to run the target R script
(default: _targets.R) if callr_function is NULL.
Ignored if callr_function is anything other than NULL.
callr_function should only be NULL for debugging and
testing purposes, not for serious runs of a pipeline, etc.
The envir argument of tar_make() and related
functions always overrides
the current value of tar_option_get("envir") in the current R session
just before running the target script file,
so whenever you need to set an alternative envir, you should always set
it with tar_option_set() from within the target script file.
In other words, if you call tar_option_set(envir = envir1) in an
interactive session and then
tar_make(envir = envir2, callr_function = NULL),
then envir2 will be used.
script: Character of length 1, path to the
target script file. Defaults to tar_config_get("script"),
which in turn defaults to _targets.R. When you set
this argument, the value of tar_config_get("script")
is temporarily changed for the current function call.
See tar_script(),
tar_config_get(), and tar_config_set() for details
about the target script file and how to set it
persistently for a project.
store: Character of length 1, path to the
targets data store. Defaults to tar_config_get("store"),
which in turn defaults to _targets/.
When you set this argument, the value of tar_config_get("store")
is temporarily changed for the current function call.
See tar_config_get() and tar_config_set() for details
about how to set the data store path persistently
for a project.
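The script and store arguments can be combined for a one-off run against non-default paths. In this sketch the file name custom_script.R and the directory custom_store are hypothetical, and it assumes tar_script() and tar_read() accept script and store arguments in the installed version of targets.

```r
# Only run when explicitly enabled and not on Windows.
run_examples <- identical(Sys.getenv("TAR_EXAMPLES"), "true") &&
  !identical(tolower(Sys.info()[["sysname"]]), "windows")
if (run_examples) {
  library(targets)
  tar_dir({
    # Write the pipeline to a non-default script path.
    tar_script({
      options(clustermq.scheduler = "multicore")
      list(tar_target(x, 1 + 1))
    }, ask = FALSE, script = "custom_script.R")
    # Use the custom script and data store for this call only.
    tar_make_clustermq(script = "custom_script.R", store = "custom_store")
    # Read the result back from the same store.
    print(tar_read(x, store = "custom_store"))
  })
}
```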
NULL except if callr_function = callr::r_bg(), in which case
a handle to the callr background process is returned. Either way,
the value is invisibly returned.
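To capture the background process handle, assign the result of the call. The sketch below passes the function callr::r_bg itself (without parentheses) and waits for it to finish; the variable name px is arbitrary.

```r
# Only run when explicitly enabled and not on Windows.
run_examples <- identical(Sys.getenv("TAR_EXAMPLES"), "true") &&
  !identical(tolower(Sys.info()[["sysname"]]), "windows")
if (run_examples) {
  library(targets)
  tar_dir({
    tar_script({
      options(clustermq.scheduler = "multicore")
      list(tar_target(x, 1 + 1))
    }, ask = FALSE)
    # Launch the pipeline in a background R process and keep the handle.
    px <- tar_make_clustermq(callr_function = callr::r_bg)
    # Block until the background pipeline finishes, then check its status.
    px$wait()
    print(px$get_exit_status())
  })
}
```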
To use with a cluster, you will need to set the global options
clustermq.scheduler and clustermq.template inside the
target script file (default: _targets.R).
To read more about configuring clustermq for your scheduler, visit
https://mschubert.github.io/clustermq/articles/userguide.html#configuration
and navigate to the appropriate link under "Setting up the scheduler".
Wildcards in the template file are filled in with elements from
tar_option_get("resources").
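A target script for a cluster might look like the sketch below. It is written under several assumptions: the scheduler is SLURM, the template file name slurm_clustermq.tmpl is hypothetical, the template contains wildcards named memory and cores (as in the sample SLURM template in the clustermq user guide), and the installed version of targets provides tar_resources() and tar_resources_clustermq(). The example only writes and prints the script; it does not submit any jobs.

```r
# Only run when explicitly enabled.
run_examples <- identical(Sys.getenv("TAR_EXAMPLES"), "true")
if (run_examples) {
  library(targets)
  tar_dir({
    tar_script({
      options(
        clustermq.scheduler = "slurm",
        clustermq.template = "slurm_clustermq.tmpl"
      )
      # Values that fill the wildcards in the template file.
      tar_option_set(
        resources = tar_resources(
          clustermq = tar_resources_clustermq(
            template = list(memory = 8192, cores = 2)
          )
        )
      )
      list(tar_target(x, 1 + 1))
    }, ask = FALSE)
    # Show the generated script. On a real cluster with the template file
    # in place, a call to tar_make_clustermq() would follow.
    writeLines(readLines("_targets.R"))
  })
}
```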
Other pipeline:
tar_make_future(),
tar_make()
# NOT RUN {
if (!identical(tolower(Sys.info()[["sysname"]]), "windows")) {
  if (identical(Sys.getenv("TAR_EXAMPLES"), "true")) {
    tar_dir({ # tar_dir() runs code from a temporary directory.
      tar_script({
        options(clustermq.scheduler = "multicore") # Does not work on Windows.
        tar_option_set()
        list(tar_target(x, 1 + 1))
      }, ask = FALSE)
      tar_make_clustermq()
    })
  }
}
# }