chrome_print: Print a web page to PDF or capture a screenshot using headless Chrome

Description

Print an HTML page to PDF or capture a PNG/JPEG screenshot through the Chrome DevTools Protocol. Google Chrome (or Chromium on Linux) must be installed prior to using this function.

Usage

chrome_print(
  input,
  output = xfun::with_ext(input, format),
  wait = 2,
  browser = "google-chrome",
  format = c("pdf", "png", "jpeg"),
  options = list(),
  selector = "body",
  box_model = c("border", "content", "margin", "padding"),
  scale = 1,
  work_dir = tempfile(),
  timeout = 30,
  extra_args = c("--disable-gpu"),
  verbose = 0,
  async = FALSE,
  encoding
)

Arguments

input

A URL or local file path to an HTML page, or a path to a local file that can be rendered to HTML via rmarkdown::render() (e.g., an R Markdown document or an R script).
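
As a minimal sketch of the simplest case, the snippet below writes a throwaway HTML page to a temporary file and prints it to PDF with the default settings. The page content and the variable names html_file and pdf_file are illustrative only; the call is wrapped in interactive() because it needs Chrome (or Chromium) and the pagedown package to be installed.

```r
# Create a small HTML page to print (illustrative content)
html_file <- tempfile(fileext = ".html")
writeLines("<html><body><h1>Hello, headless Chrome</h1></body></html>", html_file)

if (interactive()) {
  # With the default output, the PDF is written next to the input file
  pdf_file <- pagedown::chrome_print(html_file)
  file.exists(pdf_file)
}
```

A URL, or a file that rmarkdown::render() can turn into HTML, can be passed in place of html_file.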

output

The output filename. For a local web page foo/bar.html, the default PDF output is foo/bar.pdf; for a remote URL https://www.example.org/foo/bar.html, the default output will be bar.pdf under the current working directory. The same rules apply for screenshots.
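
The naming rule above can be illustrated with a small base R helper. Note that default_output() is a hypothetical function written for this example, not part of pagedown, and it ignores edge cases such as URLs with query strings.

```r
# Hypothetical helper that mimics the documented default output name
default_output <- function(input, format = "pdf") {
  is_url <- grepl("^https?://", input)
  # For a URL, keep only the file name so the result is relative to the
  # current working directory; a local path keeps its directory
  path <- if (is_url) basename(input) else input
  paste0(tools::file_path_sans_ext(path), ".", format)
}

default_output("foo/bar.html")                          # "foo/bar.pdf"
default_output("https://www.example.org/foo/bar.html")  # "bar.pdf"
default_output("foo/bar.html", format = "png")          # "foo/bar.png"
```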

wait

The number of seconds to wait for the page to load before printing. In certain cases, the page may not be immediately ready for printing, especially when there are JavaScript applications on the page, so you may need to wait longer.

browser

Path to Google Chrome or Chromium. This function will try to find it automatically via find_chrome() if the path is not explicitly provided and the environment variable PAGEDOWN_CHROME is not set.
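
A sketch of the two ways to point the function at a specific browser. The path below is an assumption for illustration and will differ between systems; the input file foo/bar.html is likewise hypothetical.

```r
# Hypothetical location of the browser; adjust for your system
chrome_path <- "/usr/bin/chromium-browser"

# Option 1: set the environment variable for the current R session
Sys.setenv(PAGEDOWN_CHROME = chrome_path)

if (interactive()) {
  # Option 2: pass the path explicitly for a single call
  pagedown::chrome_print("foo/bar.html", browser = chrome_path)

  # Show what the automatic lookup finds on this machine
  pagedown::find_chrome()
}
```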

format

The output format.

options

A list of page options. See https://chromedevtools.github.io/devtools-protocol/tot/Page#method-printToPDF for the full list of options for PDF output, and https://chromedevtools.github.io/devtools-protocol/tot/Page#method-captureScreenshot for options for screenshots. Note that for PDF output, we have changed the defaults of printBackground (TRUE) and preferCSSPageSize (TRUE) in this function.
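
For example, the following sketch builds a list of PDF options using parameter names from the Page.printToPDF method linked above. The particular values and the input file foo/bar.html are assumptions chosen for illustration; paper dimensions are given in inches, as in the protocol documentation.

```r
pdf_options <- list(
  landscape = TRUE,
  paperWidth = 8.27,        # A4 width in inches
  paperHeight = 11.69,      # A4 height in inches
  printBackground = FALSE   # override the TRUE default set by chrome_print()
)

if (interactive()) {
  pagedown::chrome_print("foo/bar.html", options = pdf_options)
}
```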

selector

A CSS selector used when capturing a screenshot.

box_model

The CSS box model used when capturing a screenshot.

scale

The scale factor used for screenshots.
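
The three screenshot arguments (selector, box_model, and scale) can be combined as in the sketch below. The HTML page and the #chart element are made up for this example; the call captures only that element, including its margin, at twice the default scale.

```r
# Illustrative page with a single element to capture
page <- tempfile(fileext = ".html")
writeLines(c(
  "<html><body>",
  "<div id='chart' style='margin: 20px; padding: 10px; border: 1px solid;'>Chart</div>",
  "</body></html>"
), page)

if (interactive()) {
  pagedown::chrome_print(
    page,
    format = "png",
    selector = "#chart",   # capture this element instead of the whole body
    box_model = "margin",  # include the element's margin in the capture
    scale = 2
  )
}
```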

work_dir

Name of the working directory for headless Chrome. If the default temporary directory does not work, try a subdirectory of your home directory.

timeout

The number of seconds before canceling the document generation. Use a larger value if the document takes longer to build.
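
For a slow or script-heavy page, both wait and timeout can be raised. The values and the URL below are arbitrary placeholders, not recommendations.

```r
wait_seconds <- 10      # give JavaScript on the page more time to finish
timeout_seconds <- 120  # allow more time before the job is canceled

if (interactive()) {
  pagedown::chrome_print(
    "https://www.example.org/foo/bar.html",
    wait = wait_seconds,
    timeout = timeout_seconds
  )
}
```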

extra_args

Extra command-line arguments to be passed to Chrome.

verbose

Level of verbosity: 0 means no messages; 1 means to print out some auxiliary messages (e.g., parameters for capturing screenshots); 2 (or TRUE) means all messages, including those from the Chrome processes and WebSocket connections.

async

Execute chrome_print() asynchronously? If TRUE, chrome_print() returns a promise value (the promises package has to be installed in this case).
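
A sketch of asynchronous use, assuming the promises package is installed. The handler functions on_done and on_error and the input file foo/bar.html are hypothetical names introduced for this example; promises::then() registers callbacks that run once the promise is resolved or rejected.

```r
# Hypothetical handlers for the two possible outcomes
on_done <- function(path) paste("Output written to", path)
on_error <- function(err) paste("Printing failed:", conditionMessage(err))

if (interactive() && requireNamespace("promises", quietly = TRUE)) {
  p <- pagedown::chrome_print("foo/bar.html", async = TRUE)
  promises::then(
    p,
    onFulfilled = function(path) message(on_done(path)),
    onRejected = function(err) message(on_error(err))
  )
}
```

The call returns immediately; the messages appear later, when the R session is idle and the promise has been settled.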

encoding

Not used. This argument is required by the RStudio IDE.

Value

Path of the output file (invisibly). If async is TRUE, this is a promise value.

References

https://developers.google.com/web/updates/2017/04/headless-chrome