drake (version 6.2.1)

sankey_drake_graph: Show a Sankey graph of your drake project.

Description

To save time on repeated plotting, this function is divided into drake_graph_info() and render_sankey_drake_graph(). A legend is unfortunately unavailable for the graph itself (https://github.com/christophergandrud/networkD3/issues/240), but you can see what all the colors mean with visNetwork::visNetwork(drake::legend_nodes()).
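The division into drake_graph_info() and render_sankey_drake_graph() can be used roughly as follows: build the graph information once, then render it as often as needed. This is a minimal sketch, not a definitive recipe. The wrapper name render_twice and the file name "sankey.html" are hypothetical, and the calls assume the drake 6.2.1 signatures with drake and networkD3 installed.

```r
# Hypothetical wrapper (not part of drake): build the graph information
# once, then reuse it for several renders.
render_twice <- function(config, file = "sankey.html") {
  # Slower step: assemble the nodes and edges of the workflow graph.
  info <- drake::drake_graph_info(config)
  # Faster step: draw the Sankey graph within R...
  print(drake::render_sankey_drake_graph(info))
  # ...and save an HTML copy without recomputing `info`.
  drake::render_sankey_drake_graph(info, file = file)
  invisible(info)
}
```

With a config from drake_config(my_plan), calling render_twice(config) would display the graph and write the HTML file in one pass.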

Usage

sankey_drake_graph(config = drake::read_drake_config(),
  file = character(0), selfcontained = FALSE, build_times = "build",
  digits = 3, targets_only = FALSE, from = NULL, mode = c("out",
  "in", "all"), order = NULL, subset = NULL, make_imports = TRUE,
  from_scratch = FALSE, group = NULL, clusters = NULL,
  show_output_files = TRUE, ...)

Arguments

config

a drake_config() configuration list. You can get one as a return value from make() as well.

file

Name of a file to save the graph. If NULL or character(0), no file is saved and the graph is rendered and displayed within R. If the file ends in a .png, .jpg, .jpeg, or .pdf extension, then a static image will be saved. In this case, the webshot package and PhantomJS are required: install.packages("webshot"); webshot::install_phantomjs(). If the file does not end in a .png, .jpg, .jpeg, or .pdf extension, an HTML file will be saved, and you can open the interactive graph using a web browser.
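The extension rule above can be restated as a small base R sketch. The helper output_kind() is hypothetical and is not how drake implements the check; it only mirrors the documented behavior, and it matches extensions exactly as written (lower case).

```r
# Hypothetical helper (not part of drake): classify an output path
# according to the documented extension rule.
output_kind <- function(file) {
  # NULL and character(0) both have length 0: nothing is saved.
  if (length(file) == 0) {
    return("display")
  }
  ext <- tools::file_ext(file)
  if (ext %in% c("png", "jpg", "jpeg", "pdf")) {
    "static image"  # needs webshot and PhantomJS
  } else {
    "html"          # interactive graph, viewable in a web browser
  }
}

output_kind(character(0))
output_kind("graph.png")
output_kind("graph.html")
```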

selfcontained

logical, whether to save the file as a self-contained HTML file (with external resources base64 encoded) or a file with external resources placed in an adjacent directory. If TRUE, pandoc is required.

build_times

character string or logical. If character, the choices are 1. "build": runtime of the command plus the time it takes to store the target or import. 2. "command": just the runtime of the command. 3. "none": no build times. If logical, build_times selects whether to show the times from build_times(..., type = "build") or use no build times at all. See build_times() for details.

digits

number of digits for rounding the build times

targets_only

logical, whether to skip the imports and only include the targets in the workflow plan.

from

Optional collection of target/import names. If from is nonempty, the graph will restrict itself to a neighborhood of from. Control the neighborhood with mode and order.

mode

Which direction to branch out in the graph to create a neighborhood around from. Use "in" to go upstream, "out" to go downstream, and "all" to go both ways and disregard edge direction altogether.

order

How far to branch out to create a neighborhood around from. Defaults to as far as possible. If a target is in the neighborhood, then so are all of its custom file_out() files if show_output_files is TRUE. That means the actual graph order may be slightly greater than you might expect, but this ensures consistency between show_output_files = TRUE and show_output_files = FALSE.
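To make the interaction of from, mode, and order concrete, here is a base R sketch of a neighborhood search on a toy edge list. The function neighborhood() and the target names are hypothetical; drake's own implementation may differ, and this sketch does not model the file_out() adjustment described above.

```r
# Toy dependency graph: raw -> clean -> model -> report.
edges <- data.frame(
  from = c("raw", "clean", "model"),
  to   = c("clean", "model", "report"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of drake): collect every node within
# `order` steps of `from`, following edges in the direction set by `mode`.
neighborhood <- function(edges, from, mode = c("out", "in", "all"),
                         order = Inf) {
  mode <- match.arg(mode)
  found <- from
  frontier <- from
  steps <- 0
  while (length(frontier) > 0 && steps < order) {
    down <- edges$to[edges$from %in% frontier]  # downstream neighbors
    up <- edges$from[edges$to %in% frontier]    # upstream neighbors
    nxt <- switch(mode, out = down, "in" = up, all = c(down, up))
    frontier <- setdiff(nxt, found)
    found <- c(found, frontier)
    steps <- steps + 1
  }
  found
}

neighborhood(edges, "clean", mode = "out")             # everything downstream
neighborhood(edges, "model", mode = "in", order = 1)   # one step upstream
neighborhood(edges, "clean", mode = "all", order = 1)  # one step either way
```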

subset

Optional character vector. Subset of targets/imports to display in the graph. Applied after from, mode, and order. Be advised: edges are only kept for adjacent nodes in subset. If you do not select all the intermediate nodes, edges will drop from the graph.
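The warning about dropped edges can be illustrated with a toy edge list in base R. The helper keep_edges() and the target names are hypothetical and only mimic the documented rule that an edge survives when both of its endpoints are in subset.

```r
# Toy dependency graph: raw -> clean -> model -> report.
edges <- data.frame(
  from = c("raw", "clean", "model"),
  to   = c("clean", "model", "report"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of drake): keep an edge only if both
# endpoints were selected.
keep_edges <- function(edges, subset) {
  edges[edges$from %in% subset & edges$to %in% subset, , drop = FALSE]
}

# Selecting every node keeps all three edges.
full <- keep_edges(edges, c("raw", "clean", "model", "report"))
# Leaving out the intermediate node "clean" drops both edges that touch it,
# so "raw" ends up disconnected from the rest of the graph.
gapped <- keep_edges(edges, c("raw", "model", "report"))

nrow(full)
nrow(gapped)
```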

make_imports

logical, whether to make the imports first. Set to FALSE to increase speed at the risk of using obsolete information.

from_scratch

logical, whether to assume all the targets will be made from scratch on the next make(). Makes all targets outdated, but keeps information about build progress in previous make()s.

group

optional character scalar, name of the column used to group nodes into columns. All the column names of your config$plan are choices. The other choices (such as "status") are column names in the nodes data frame. To group nodes into clusters in the graph, you must also supply the clusters argument.

clusters

optional character vector of values to cluster on. These values must be elements of the column of the nodes data frame that you specify in the group argument to drake_graph_info().

show_output_files

logical, whether to include file_out() files in the graph.

...

arguments passed to networkD3::sankeyNetwork().

Value

A Sankey graph created with networkD3::sankeyNetwork().

See Also

render_sankey_drake_graph(), vis_drake_graph(), drake_ggraph()

Examples

# NOT RUN {
test_with_dir("Quarantine side effects.", {
load_mtcars_example() # Get the code with drake_example("mtcars").
config <- drake_config(my_plan)
# Plot the network graph representation of the workflow.
sankey_drake_graph(config, width = '100%') # The width is passed to networkD3::sankeyNetwork()
# Show the legend separately.
visNetwork::visNetwork(nodes = drake::legend_nodes())
make(my_plan) # Run the project, build the targets.
sankey_drake_graph(config) # The black nodes from before are now green.
# Plot a subgraph of the workflow.
sankey_drake_graph(config, from = c("small", "reg2"))
# Optionally visualize clusters.
config$plan$large_data <- grepl("large", config$plan$target)
sankey_drake_graph(
  config, group = "large_data", clusters = c(TRUE, FALSE))
# You can even use clusters given to you for free in the `graph$nodes`
# data frame of `drake_graph_info()`.
sankey_drake_graph(
  config, group = "status", clusters = "imported")
})
# }
