future.batchtools (version 0.20.0)

batchtools_bash: A batchtools bash backend that resolves futures sequentially via a Bash template script

Description

The batchtools_bash backend was added to illustrate how to write a custom future.batchtools backend that uses a templated job script. Please see the source code for details.

Usage

batchtools_bash(
  ...,
  template = "bash",
  fs.latency = 0,
  resources = list(),
  delete = getOption("future.batchtools.delete", "on-success")
)

makeClusterFunctionsBash(template = "bash", fs.latency = 0, ...)

Value

makeClusterFunctionsBash() returns a ClusterFunctions object.

Arguments

template

(optional) Name of the job-script template to be searched for by batchtools::findTemplateFile(). If not found, it defaults to the templates/bash.tmpl file that is part of this package (see below).
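As a small illustration of the fallback described above, the following sketch locates the bundled default template with base R's system.file() and prints its first lines. It assumes nothing beyond base R; if future.batchtools is not installed, system.file() returns an empty string and the sketch only reports that. The variable name default_template is chosen here for illustration.

```r
# Path to the default template bundled with future.batchtools.
# system.file() returns "" when the package or file is not available.
default_template <- system.file(
  "templates", "bash.tmpl",
  package = "future.batchtools"
)

if (nzchar(default_template)) {
  # Show the first few lines of the bundled template
  writeLines(head(readLines(default_template), n = 10L))
} else {
  message("future.batchtools (or its bash.tmpl) was not found")
}
```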

fs.latency

[numeric(1)]
Expected maximum latency of the file system, in seconds. Set to a positive number for network file systems like NFS, which enables more robust (but also more expensive) mechanisms to access files and directories. It is usually safe to set this to 0 to disable the heuristic, e.g. if you are working on a local file system.

resources

(optional) A named list passed to the batchtools job-script template as variable resources. This is based on how batchtools::submitJobs() works, with the exception of specially reserved names defined by the future.batchtools package:

  • resources[["asis"]] is a character vector whose elements are passed as-is to the job script and injected as job resource declarations.

  • resources[["modules"]] is a character vector of Linux environment modules to be loaded.

  • resources[["startup"]] and resources[["shutdown"]] are character vectors of shell code to be injected to the job script as-is.

  • resources[["details"]], if TRUE, results in the job script outputting job details and job summaries at the beginning and at the end.

  • All remaining named elements of resources are injected as named resource specifications for the scheduler.
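To make the reserved names concrete, here is a minimal sketch of a resources list and of how the reserved entries can be told apart from the remaining ones. The module names, the shell snippets, and the mem entry are hypothetical placeholders, and the split shown here is only an illustration of the rule in the list above, not the package's internal code.

```r
# A resources list using some of the reserved names described above.
# All values below are hypothetical placeholders.
resources <- list(
  modules  = c("r", "gcc"),           # environment modules to load
  startup  = 'echo "Job starting"',   # shell code injected as-is
  shutdown = 'echo "Job finished"',   # shell code injected as-is
  details  = TRUE,                    # ask for job details and summaries
  mem      = "2G"                     # hypothetical non-reserved entry
)

# Names that the documentation lists as specially reserved
reserved <- c("asis", "modules", "startup", "shutdown", "details")

# Remaining named elements, i.e. those meant for the scheduler
scheduler_resources <- resources[setdiff(names(resources), reserved)]
str(scheduler_resources)

# The full list would then be passed on via plan(), for example:
# future::plan(future.batchtools::batchtools_bash, resources = resources)
```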

delete

Controls if and when the batchtools job registry folder is deleted. If "on-success" (default), it is deleted if the future was resolved successfully and the expression did not produce an error. If "never", then it is never deleted. If "always", then it is always deleted.
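The default for delete is looked up from the R option future.batchtools.delete, as shown in the Usage section. The sketch below only demonstrates that lookup with base R; the variable names are illustrative. Setting the option to "never" can be handy when a job registry folder should be kept for inspection.

```r
# Temporarily unset the option to see the built-in fallback
old <- options(future.batchtools.delete = NULL)
default_delete <- getOption("future.batchtools.delete", "on-success")

# Keep job registry folders around, e.g. while troubleshooting
options(future.batchtools.delete = "never")
debug_delete <- getOption("future.batchtools.delete", "on-success")

# Restore the previous setting
options(old)

print(c(default = default_delete, debug = debug_delete))
```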

...

Not used.

Details

Batchtools bash futures use batchtools cluster functions created by makeClusterFunctionsBash() and require that bash is installed on the current machine and that the timeout command is available.
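One way to check these two requirements before selecting the backend is to look the commands up on the search path with base R's Sys.which(). This is only a sketch of such a pre-flight check; it is not something the package is known to do in this form, and the variable names are illustrative.

```r
# Commands that the bash backend relies on, per the paragraph above
required <- c("bash", "timeout")

# Sys.which() returns "" for commands that are not on the PATH
found <- Sys.which(required)
not_found <- required[!nzchar(found)]

if (length(not_found) > 0) {
  message("Missing command(s): ", paste(not_found, collapse = ", "))
} else {
  message("bash and timeout are both available")
}
```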

The default template script templates/bash.tmpl can be found in:

system.file("templates", "bash.tmpl", package = "future.batchtools")

and comprises:

#!/bin/bash
######################################################################
# A batchtools launch script template
#
# Author: Henrik Bengtsson 
######################################################################

## Bash settings
set -e          # exit on error
set -u          # error on unset variables
set -o pipefail # fail a pipeline if any command fails
trap 'echo "ERROR: future.batchtools job script failed on line $LINENO" >&2; exit 1' ERR

## Redirect stdout and stderr to the batchtools log file
exec > <%= log.file %> 2>&1

<%
  ## Maximum runtime?
  runtime <- resources[["timeout"]]
  resources[["timeout"]] <- NULL
  timeout <- if (is.null(runtime)) "" else sprintf("timeout %s", runtime)

  ## Shell "startup" code to evaluate
  startup <- resources[["startup"]]
  resources[["startup"]] <- NULL

  ## Shell "shutdown" code to evaluate
  shutdown <- resources[["shutdown"]]
  resources[["shutdown"]] <- NULL

  ## Environment modules specifications
  modules <- resources[["modules"]]
  resources[["modules"]] <- NULL
%>

<% if (length(startup) > 0) {
  writeLines(startup)
} %>

<% if (length(modules) > 0) {
  writeLines(c(
    'echo "Load environment modules:"',
    sprintf('echo "- modules: %s"', paste(modules, collapse = ", ")),
    sprintf("module load %s", paste(modules, collapse = " ")),
    "module list"
  ))
} %>

echo "Session information:"
echo "- timestamp: $(date +"%Y-%m-%d %H:%M:%S%z")"
echo "- hostname: $(hostname)"
echo "- Rscript path: $(which Rscript)"
echo "- Rscript version: $(Rscript --version)"
echo "- Rscript library paths: $(Rscript -e "cat(shQuote(.libPaths()), sep = ' ')")"
echo

# Launch R and evaluate the batchtools R job
echo "Rscript -e 'batchtools::doJobCollection()' ..."
echo "- job name: '<%= job.name %>'"
echo "- job log file: '<%= log.file %>'"
echo "- job uri: '<%= uri %>'"
<%= timeout %> Rscript -e 'batchtools::doJobCollection("<%= uri %>")'
res=$?
echo " - exit code: ${res}"
echo "Rscript -e 'batchtools::doJobCollection()' ... done"
echo

<% if (length(shutdown) > 0) {
  writeLines(shutdown)
} %>

echo "End time: $(date +"%Y-%m-%d %H:%M:%S%z")"

## Relay the exit code from Rscript
exit "${res}"
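The embedded R code in the template turns entries of resources into shell lines. The sketch below replays two of those steps in plain R so they can be tried outside of a job script: building the timeout prefix from resources[["timeout"]], and generating the module-loading lines. The helper names timeout_prefix() and module_lines() are hypothetical and exist only for this illustration; the expressions inside them are copied from the template.

```r
# Hypothetical helper: the prefix placed in front of the Rscript call
timeout_prefix <- function(resources) {
  runtime <- resources[["timeout"]]
  if (is.null(runtime)) "" else sprintf("timeout %s", runtime)
}

# Hypothetical helper: shell lines emitted when modules are requested
module_lines <- function(modules) {
  if (length(modules) == 0) return(character(0))
  c(
    'echo "Load environment modules:"',
    sprintf('echo "- modules: %s"', paste(modules, collapse = ", ")),
    sprintf("module load %s", paste(modules, collapse = " ")),
    "module list"
  )
}

# With a timeout, the Rscript call is wrapped by the timeout command
print(timeout_prefix(list(timeout = 30)))

# Without one, the prefix is empty and Rscript runs unwrapped
print(timeout_prefix(list()))

# Module names here are placeholders
writeLines(module_lines(c("r", "gcc")))
```

Note that the template reads the element named timeout; other names in resources are not consulted when building this prefix.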

Examples

if (FALSE) { # interactive()
library(future)

# Limit runtime to 30 seconds per future
plan(future.batchtools::batchtools_bash, resources = list(runtime = 30))

message("Main process ID: ", Sys.getpid())

f <- future({
  data.frame(
    hostname = Sys.info()[["nodename"]],
          os = Sys.info()[["sysname"]],
       cores = unname(parallelly::availableCores()),
         pid = Sys.getpid(),
     modules = Sys.getenv("LOADEDMODULES")
  )
})
info <- value(f)
print(info)
}