batchtools

As the successor to the packages BatchJobs and BatchExperiments, batchtools provides a parallel implementation of Map for high performance computing systems managed by schedulers such as Slurm, Sun Grid Engine, OpenLava, TORQUE/OpenPBS, Load Sharing Facility (LSF) or Docker Swarm (see the Setup vignette).
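To make the Map analogy concrete, here is a minimal sketch of a typical workflow. It assumes batchtools is installed and that no cluster configuration is present, so jobs run with the default interactive cluster functions in the current R session. The function `square` and the variable names are illustrative only.

```r
library(batchtools)

# Create a temporary registry; file.dir = NA places it in a temp directory.
reg <- makeRegistry(file.dir = NA, seed = 1)

# Hypothetical toy function to map over the inputs.
square <- function(x) x^2

# Define one job per element of x, then submit and wait for completion.
ids <- batchMap(fun = square, x = 1:5, reg = reg)
submitJobs(ids, reg = reg)
waitForJobs(reg = reg)

# Collect the results as a list, one element per job.
results <- reduceResultsList(reg = reg)
```

On a real cluster the same calls should apply once the registry is configured with the matching cluster functions, for example those created by makeClusterFunctionsSlurm.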

The main features include:

  • Convenience: All relevant batch system operations (submitting, listing, killing) are either handled internally or abstracted via simple R functions
  • Portability: With a well-defined interface, the source is independent of the underlying batch system: prototype locally, deploy on any high performance cluster
  • Reproducibility: Every computational part has an associated seed stored in a database, which ensures reproducibility even when the underlying batch system changes
  • Abstraction: The code layers for algorithms, experiment definitions and execution are cleanly separated, which allows you to write readable and maintainable code to manage large-scale computer experiments
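The separation of problems, algorithms and experiment definitions can be sketched as follows. This is a minimal illustration under stated assumptions, not a recommended design: the problem name "draw", the algorithm name "summarize" and the parameters `n` and `method` are hypothetical, and the jobs run in the current R session with the default interactive cluster functions.

```r
library(batchtools)

# Temporary experiment registry; the seed makes job seeds reproducible.
reg <- makeExperimentRegistry(file.dir = NA, seed = 1)

# Problem: static data plus a function that creates a problem instance.
addProblem(name = "draw", data = 1:100,
  fun = function(job, data, n, ...) sample(data, n),
  reg = reg)

# Algorithm: operates on the instance generated by the problem.
addAlgorithm(name = "summarize",
  fun = function(job, data, instance, method, ...) {
    switch(method, mean = mean(instance), median = median(instance))
  },
  reg = reg)

# Experiments: cross problem and algorithm parameters, replicated twice.
addExperiments(
  prob.designs = list(draw = data.frame(n = c(10, 20))),
  algo.designs = list(summarize = data.frame(
    method = c("mean", "median"), stringsAsFactors = FALSE)),
  repls = 2,
  reg = reg)

submitJobs(reg = reg)
waitForJobs(reg = reg)
res <- reduceResultsList(reg = reg)
```

With two problem settings, two algorithm settings and two replications, this defines eight jobs in total.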

Installation

Install the stable version from CRAN:

install.packages("batchtools")

For the development version, use devtools:

devtools::install_github("mllg/batchtools")

Why batchtools?

The development of BatchJobs and BatchExperiments is discontinued for the following reasons:

  • Maintainability: The packages BatchJobs and BatchExperiments are tightly connected, which makes maintenance difficult. Changes have to be synchronized and tested against the current CRAN versions for compatibility. Furthermore, BatchExperiments violates CRAN policies by calling internal functions of BatchJobs.
  • Database issues: Although we invested weeks to mitigate issues with locks of the SQLite database or file system (staged queries, file system timeouts, ...), BatchJobs kept working unreliably on some systems with high latency or specific file systems. This made BatchJobs unusable for many users.

BatchJobs and BatchExperiments will remain on CRAN, but new features are unlikely to be ported back. See this vignette for a comparison of the packages.

Resources

Citation

Please cite the JOSS paper using the following BibTeX entry:

@article{,
  doi = {10.21105/joss.00135},
  url = {https://doi.org/10.21105/joss.00135},
  year  = {2017},
  month = {feb},
  publisher = {The Open Journal},
  volume = {2},
  number = {10},
  author = {Michel Lang and Bernd Bischl and Dirk Surmann},
  title = {batchtools: Tools for R to work on batch systems},
  journal = {The Journal of Open Source Software}
}

Related Software

  • The High Performance Computing Task View lists the most relevant packages for scientific computing with R
  • batch assists in splitting and submitting jobs to LSF and MOSIX clusters
  • flowr supports LSF, Slurm, TORQUE and Moab and provides a scatter-gather approach to define computational jobs

Contributing to batchtools

This R package is licensed under the LGPL-3. If you encounter problems using this software (lack of documentation, misleading or wrong documentation, unexpected behaviour, bugs, ...) or just want to suggest features, please open an issue in the issue tracker. Pull requests are welcome and will be included at the discretion of the author. If you have customized a template file for your (larger) computing site, please share it: fork the repository, place your template in inst/templates and send a pull request.

Version: 0.9.3
License: LGPL-3
Maintainer: Michel Lang
Last Published: April 21st, 2017
Monthly Downloads: 15,832

Functions in batchtools (0.9.3)

  • JoinTables: Inner, Left, Right, Outer, Semi and Anti Join for Data Tables
  • Tags: Add or Remove Job Tags
  • makeJobCollection: JobCollection Constructor
  • makeJob: Jobs and Experiments
  • addExperiments: Add Experiments to the Registry
  • addProblem: Define Problems for Experiments
  • batchExport: Export Objects to the Slaves
  • batchMap: Map Operation for Batch Systems
  • Worker: Create a Linux-Worker
  • addAlgorithm: Define Algorithms for Experiments
  • batchMapResults: Map Over Results to Create New Jobs
  • batchReduce: Reduce Operation for Batch Systems
  • cfHandleUnknownSubmitError: Cluster Functions Helper to Handle Unknown Errors
  • cfKillJob: Cluster Functions Helper to Kill Batch Jobs
  • batchtools-deprecated: Deprecated function in the batchtools package
  • chunkIds: Chunk Jobs for Sequential Execution
  • clearRegistry: Remove All Jobs
  • loadRegistry: Load a Registry from the File System
  • makeClusterFunctions: ClusterFunctions Constructor
  • makeClusterFunctionsDocker: ClusterFunctions for Docker
  • makeClusterFunctionsTORQUE: ClusterFunctions for OpenPBS/TORQUE Systems
  • batchtools-package: batchtools: Tools for Computation on Batch Systems
  • makeExperimentRegistry: ExperimentRegistry Constructor
  • resetJobs: Reset the Computational State of Jobs
  • runHook: Trigger Evaluation of Custom Function
  • btlapply: Synchronous Apply Functions
  • cfBrewTemplate: Cluster Functions Helper to Write Job Description Files
  • getDefaultRegistry: Get and Set the Default Registry
  • getErrorMessages: Retrieve Error Messages
  • loadResult: Load the Result of a Single Job
  • makeClusterFunctionsSlurm: ClusterFunctions for Slurm Systems
  • makeClusterFunctionsSocket: ClusterFunctions for Parallel Socket Execution
  • summarizeExperiments: Quick Summary over Experiments
  • makeClusterFunctionsSGE: ClusterFunctions for SGE Systems
  • makeClusterFunctionsSSH: ClusterFunctions for Remote SSH Execution
  • showLog: Inspect Log Files
  • submitJobs: Submit Jobs to the Batch Systems
  • cfReadBrewTemplate: Cluster Functions Helper to Parse a Brew Template
  • chunk: Chunk Jobs for Sequential Execution
  • getJobTable: Query Job Information
  • getStatus: Summarize the Computational Status
  • syncRegistry: Synchronize the Registry
  • testJob: Run Jobs Interactively
  • doJobCollection: Execute Jobs of a JobCollection
  • estimateRuntimes: Estimate Remaining Runtimes
  • grepLogs: Grep Log Files for a Pattern
  • killJobs: Kill Jobs
  • runOSCommand: Run OS Commands on Local or Remote Machines
  • sweepRegistry: Check Consistency and Remove Obsolete Information
  • waitForJobs: Wait for Termination of Jobs
  • makeClusterFunctionsMulticore: ClusterFunctions for Parallel Multicore Execution
  • makeClusterFunctionsOpenLava: ClusterFunctions for OpenLava
  • makeRegistry: Registry Constructor
  • makeSubmitJobResult: Create a SubmitJobResult
  • removeExperiments: Remove Experiments
  • removeRegistry: Remove a Registry from the File System
  • execJob: Execute a Single Job
  • findJobs: Find and Filter Jobs
  • makeClusterFunctionsInteractive: ClusterFunctions for Sequential Execution in the Running R Session
  • makeClusterFunctionsLSF: ClusterFunctions for LSF Systems
  • reduceResults: Reduce Results
  • reduceResultsList: Apply Functions on Results
  • saveRegistry: Store the Registry to the File System