Connect and schedule pipelines
Connect and schedule pipelines
A list containing
id
the pipeline ID that can be used by deps
pipeline
forked pipeline instance
target_names
copy of names
depend_on
copy of deps
cue
copy of cue
standalone
copy of standalone
verbose
whether to verbose the build
root_path
path to the directory that contains pipelines and scheduler
collection_path
path to the pipeline collections
pipeline_ids
pipeline ID codes
new()
Constructor
PipelineCollections$new(root_path = NULL, overwrite = FALSE)
root_path
where to store the pipelines and intermediate results
overwrite
whether to overwrite if root_path
exists
add_pipeline()
Add pipeline into the collection
PipelineCollections$add_pipeline(
x,
names = NULL,
deps = NULL,
pre_hook = NULL,
post_hook = NULL,
cue = c("always", "thorough", "never"),
search_paths = pipeline_root(),
standalone = TRUE,
hook_envir = parent.frame()
)
x
a pipeline name (can be found via pipeline_list
),
or a PipelineTools
names
pipeline targets to execute
deps
pipeline IDs to depend on; see 'Values' below
pre_hook
function to run before the pipeline; the function needs two arguments: input map (can be edit in-place), and path to a directory that allows to store temporary files
post_hook
function to run after the pipeline; the function needs two arguments: pipeline object, and path to a directory that allows to store intermediate results
cue
whether to always run dependence
search_paths
where to search for pipeline if x
is a
character; ignored when x
is a pipeline object
standalone
whether the pipeline should be standalone, set to
TRUE
if the same pipeline added multiple times should run
independently; default is true
hook_envir
where to look for global environments if pre_hook
or post_hook
contains global variables; default is the calling
environment
build_pipelines()
Build pipelines and visualize
PipelineCollections$build_pipelines(visualize = TRUE)
visualize
whether to visualize the pipeline; default is true
run()
Run the collection of pipelines
PipelineCollections$run(
error = c("error", "warning", "ignore"),
.scheduler = c("none", "future", "clustermq"),
.type = c("callr", "smart", "vanilla"),
.as_promise = FALSE,
.async = FALSE,
rebuild = NA,
...
)
error
what to do when error occurs; default is 'error'
throwing errors; other choices are 'warning'
and 'ignore'
.scheduler, .type, .as_promise, .async, ...
passed to
pipeline_run
rebuild
whether to re-build the pipeline; default is NA
(
if the pipeline has been built before, then do not rebuild)
get_scheduler()
Get scheduler
object
PipelineCollections$get_scheduler()