Create the plan argument of make()
A drake plan is a data frame with columns "target" and "command". Each target is an R object produced in your workflow, and each command is the R code to produce it.
drake_plan(..., list = character(0), file_targets = NULL,
strings_in_dots = NULL, tidy_evaluation = NULL, transform = TRUE,
trace = FALSE, envir = parent.frame(), tidy_eval = TRUE,
max_expand = NULL)
...: A collection of symbols/targets with commands assigned to them. See the examples for details.
list: Deprecated.
file_targets: Deprecated.
strings_in_dots: Deprecated.
tidy_evaluation: Deprecated. Use tidy_eval instead.
transform: Logical, whether to transform the plan into a larger plan with more targets. Requires the transform field in target(). See the examples for details.
trace: Logical, whether to add columns to show what happens during target transformations.
envir: Environment for tidy evaluation.
tidy_eval: Logical, whether to use tidy evaluation (e.g. unquoting/!!) when resolving commands. Tidy evaluation in transformations is always turned on regardless of the value you supply to this argument.
max_expand: Positive integer, optional upper bound on the lengths of grouping variables for map() and cross(). Comes in handy when you have a massive number of targets and you want to test on a miniature version of your workflow before you scale up to production.
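As a rough illustration of the tidy_eval and max_expand arguments described above, the sketch below builds a few small plans without running them. It assumes the drake package is installed in a version matching the usage shown here; the object names (threshold, full, small) are made up for the example.

```r
library(drake)

# tidy_eval: `!!` splices the current value of `threshold` into the command.
threshold <- 3
unquoted <- drake_plan(big = mtcars[mtcars$wt > !!threshold, ])
literal <- drake_plan(
  big = mtcars[mtcars$wt > !!threshold, ],
  tidy_eval = FALSE # Leave the `!!threshold` expression untouched.
)
print(unquoted$command)
print(literal$command)

# max_expand: cap the number of values each grouping variable contributes,
# which is handy for a quick small-scale test run.
full <- drake_plan(
  sim = target(rnorm(n), transform = map(n = c(10, 20, 30, 40)))
)
small <- drake_plan(
  sim = target(rnorm(n), transform = map(n = c(10, 20, 30, 40))),
  max_expand = 2
)
print(nrow(full))
print(nrow(small))
```

The commands are only stored, not evaluated, so nothing here calls make() or writes to a cache.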
A data frame of targets, commands, and optional custom columns.
drake_plan() creates a special data frame. At minimum, that data frame must have columns target and command with the target names and the R code chunks to build them, respectively.
You can add custom columns yourself, either with target() (e.g. drake_plan(targ = target(my_cmd(), custom = "column"))) or by appending columns post-hoc (e.g. plan$col <- vals).
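Both ways of adding custom columns can be sketched as follows. This is a minimal example, assuming drake is installed; the target names and the priority column are hypothetical and carry no special meaning to drake.

```r
library(drake)

# Custom column declared through target().
plan <- drake_plan(
  targ = target(sqrt(16), custom = "column"),
  other = targ + 1
)

# Custom column appended after the plan is created.
plan$priority <- c(1, 2)

print(plan)
```

Targets that do not set a given custom column should simply show a missing value in that column.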
Some of these custom columns are special. They are optional, but drake looks for them at various points in the workflow.
elapsed and cpu: number of seconds to wait for the target to build before timing out (elapsed for elapsed time and cpu for CPU time).
hpc: logical values (TRUE/FALSE/NA) whether to send each target to parallel workers. Visit https://ropenscilabs.github.io/drake-manual/hpc.html#selectivity to learn more.
resources: target-specific lists of resources for a computing cluster. See https://ropenscilabs.github.io/drake-manual/hpc.html#advanced-options for details.
retries: number of times to retry building a target in the event of an error.
seed: an optional pseudo-random number generator (RNG) seed for each target. drake usually comes up with its own unique reproducible target-specific seeds using the global seed (the seed argument to make() and drake_config()) and the target names, but you can overwrite these automatic seeds. NA entries default back to drake's automatic seeds.
trigger: rule to decide whether a target needs to run. It is recommended that you define this one with target(). Details: https://ropenscilabs.github.io/drake-manual/triggers.html
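A few of the special columns listed above can be set per target through target(), as in the sketch below. It assumes drake is installed; the target names and the particular values (seed 123, 2 retries, a 60-second limit) are arbitrary choices for illustration.

```r
library(drake)

plan <- drake_plan(
  draws = target(
    rnorm(10),
    seed = 123,                        # Override drake's automatic seed.
    retries = 2,                       # Retry up to twice on error.
    elapsed = 60,                      # Time out after 60 seconds elapsed.
    trigger = trigger(condition = TRUE) # Always rebuild this target.
  ),
  avg = mean(draws)
)

print(plan)
```

The avg target sets none of these columns, so it falls back to drake's defaults when the plan is later passed to make().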
drake_plan() understands special keyword functions for your commands. With the exception of target(), each one is a proper function with its own help file.
target(): declare more than just the command, e.g. assign a trigger or transform. Examples: https://ropenscilabs.github.io/drake-manual/plans.html#large-plans
file_in(): declare an input file dependency.
file_out(): declare an output file to be produced when the target is built.
knitr_in(): declare a knitr file dependency such as an R Markdown (*.Rmd) or R LaTeX (*.Rnw) file.
ignore(): force drake to entirely ignore a piece of code: do not track it for changes and do not analyze it for dependencies.
no_deps(): tell drake to not track the dependencies of a piece of code. drake still tracks the code itself for changes.
drake_envir(): get the environment where drake builds targets. Intended for advanced custom memory management.
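The keywords above only mark up commands; they take effect when the plan is analyzed and built. A small sketch, assuming drake is installed and using made-up target names, shows file_in(), file_out(), and ignore() inside a plan, plus deps_code() to inspect what drake detects in one command. The plan is only constructed here, so no file is written.

```r
library(drake)

# Build the plan only; nothing runs and no files are touched.
plan <- drake_plan(
  written = write.csv(mtcars[, c("mpg", "cyl")], file_out("mtcars.csv")),
  value = read.csv(file_in("mtcars.csv")),
  # ignore() hides the timestamp from change and dependency tracking.
  note = paste(nrow(value), "rows; checked", ignore(Sys.time()))
)
print(plan)

# List the dependencies drake detects in a single command.
deps <- deps_code(quote(read.csv(file_in("mtcars.csv"))))
print(deps)
```

Because written declares mtcars.csv as an output and value declares it as an input, make() should build written before value.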
drake has special syntax for generating large plans. Your code will look something like drake_plan(x = target(cmd, transform = f(y, z), group = g)) where f() is either map(), cross(), split(), or combine() (similar to purrr::pmap(), tidyr::crossing(), base::split(), and dplyr::summarize(), respectively). These verbs mimic Tidyverse behavior to scale up existing plans to large numbers of targets. You can read about this interface at https://ropenscilabs.github.io/drake-manual/plans.html#large-plans
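To make the transformation syntax concrete, here is a small sketch that chains map() and combine(). It assumes drake is installed; the target names data, avg, and all_avgs are invented for the example.

```r
library(drake)

plan <- drake_plan(
  # One target per value of n.
  data = target(
    rnorm(n),
    transform = map(n = c(5, 10))
  ),
  # One summary per data target.
  avg = target(
    mean(data),
    transform = map(data)
  ),
  # A single target that gathers every avg target.
  all_avgs = target(
    c(avg),
    transform = combine(avg)
  )
)

print(plan$target)
```

Two data targets, two avg targets, and one combined target come out of three declarations, which is the scaling-up behavior the verbs are meant to provide.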
Besides "target" and "command", drake_plan() understands a special set of optional columns. For details, visit https://ropenscilabs.github.io/drake-manual/plans.html#special-custom-columns-in-your-plan
# NOT RUN {
isolate_example("contain side effects", {
# For more examples, visit
# https://ropenscilabs.github.io/drake-manual/plans.html.
# Create drake plans:
mtcars_plan <- drake_plan(
  write.csv(mtcars[, c("mpg", "cyl")], file_out("mtcars.csv")),
  value = read.csv(file_in("mtcars.csv"))
)
mtcars_plan
make(mtcars_plan) # Makes `mtcars.csv` and then `value`
head(readd(value))
# You can use knitr inputs too. See the top command below.
load_mtcars_example()
head(my_plan)
# The `knitr_in("report.Rmd")` tells `drake` to dive into the active
# code chunks to find dependencies.
# There, `drake` sees that `small`, `large`, and `coef_regression2_small`
# are loaded in with calls to `loadd()` and `readd()`.
deps_code("report.Rmd")
# Use transformations to generate large plans.
# Read more at
# <https://ropenscilabs.github.io/drake-manual/plans.html#create-large-plans-the-easy-way>. # nolint
drake_plan(
  data = target(
    simulate(nrows),
    transform = map(nrows = c(48, 64)),
    custom_column = 123
  ),
  reg = target(
    reg_fun(data),
    transform = cross(reg_fun = c(reg1, reg2), data)
  ),
  summ = target(
    sum_fun(data, reg),
    transform = cross(sum_fun = c(coef, residuals), reg)
  ),
  winners = target(
    min(summ),
    transform = combine(summ, .by = c(data, sum_fun))
  )
)
# Split data among multiple targets.
drake_plan(
  large_data = get_data(),
  slice_analysis = target(
    large_data %>%
      analyze(),
    transform = split(large_data, slices = 4)
  ),
  results = target(
    rbind(slice_analysis),
    transform = combine(slice_analysis)
  )
)
# Set trace = TRUE to show what happened during the transformation process.
drake_plan(
  data = target(
    simulate(nrows),
    transform = map(nrows = c(48, 64)),
    custom_column = 123
  ),
  reg = target(
    reg_fun(data),
    transform = cross(reg_fun = c(reg1, reg2), data)
  ),
  summ = target(
    sum_fun(data, reg),
    transform = cross(sum_fun = c(coef, residuals), reg)
  ),
  winners = target(
    min(summ),
    transform = combine(summ, .by = c(data, sum_fun))
  ),
  trace = TRUE
)
# You can create your own custom columns too.
# See ?triggers for more on triggers.
drake_plan(
  website_data = target(
    command = download_data("www.your_url.com"),
    trigger = "always",
    custom_column = 5
  ),
  analysis = analyze(website_data)
)
# Tidy evaluation can help generate super large plans.
sms <- rlang::syms(letters) # To sub in character args, skip this.
drake_plan(x = target(f(char), transform = map(char = !!sms)))
})
# }