batchr

batchr is an R package to batch process files using an R function.

The key design principle is that only files which were last modified before the directory was configured are processed. A hidden file stores the configuration details (time, function, etc.), while successfully processed files are automatically touched to update their modification time.

As a result, batch processing can be stopped and restarted, and any files created (or modified or deleted) during processing are ignored.
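The timestamp rule can be illustrated with base R alone. The following is a conceptual sketch, not batchr's internal code: the directory name, the file names and the `config_time` variable are hypothetical stand-ins for what the package stores in its hidden configuration file.

```r
# Conceptual sketch only: this is NOT how batchr is implemented internally.
dir <- file.path(tempdir(), "mtime-sketch")
unlink(dir, recursive = TRUE)
dir.create(dir)

# A file that existed well before configuration (backdated by 60 seconds).
old <- file.path(dir, "old.csv")
writeLines("x", old)
Sys.setFileTime(old, Sys.time() - 60)

# Stand-in for the configuration time recorded in the hidden file.
config_time <- Sys.time() - 30

# A file created after configuration; its modification time is "now".
new <- file.path(dir, "new.csv")
writeLines("x", new)

# Only files last modified before the configuration time are candidates.
files <- list.files(dir, full.names = TRUE)
remaining <- files[file.mtime(files) < config_time]
print(basename(remaining))

# "Touching" a processed file moves it past the configuration time,
# so it is not picked up again if processing is restarted.
Sys.setFileTime(old, Sys.time())
remaining_after <- files[file.mtime(files) < config_time]
print(length(remaining_after))
```

Because eligibility depends only on modification times, no separate list of processed files is needed to resume an interrupted run.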

To allow the user control over the reprocessing of problematic files, all processing attempts (SUCCESS or FAILURE) are recorded in a hidden log file.

Installation

You can install the released version of batchr from CRAN with:

install.packages("batchr")

And the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("poissonconsulting/batchr")

Demonstration

Consider a directory with two .csv files:

path <- file.path(tempdir(), "example")
unlink(path, recursive = TRUE, force = TRUE) # recursive = TRUE is required to delete a directory
dir.create(path)

write.csv(data.frame(x = 1), file.path(path, "file1.csv"), row.names = FALSE)
write.csv(data.frame(x = 3), file.path(path, "file2.csv"), row.names = FALSE)

First define the function to process them.

fun <- function(file) {
  data <- read.csv(file)
  data$x <- data$x * 2
  write.csv(data, file, row.names = FALSE)
  TRUE
}
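Since this function overwrites its input file, it may be worth trying it on a scratch file before pointing it at a whole directory. A minimal sketch using base R only; the `scratch` file and its contents are hypothetical.

```r
# Same processing function as above, repeated so the sketch is self-contained.
fun <- function(file) {
  data <- read.csv(file)
  data$x <- data$x * 2
  write.csv(data, file, row.names = FALSE)
  TRUE
}

# Hypothetical scratch file used only for this trial run.
scratch <- tempfile(fileext = ".csv")
write.csv(data.frame(x = c(1, 3)), scratch, row.names = FALSE)

ok <- fun(scratch)          # the function returns TRUE when it finishes
result <- read.csv(scratch) # the file now holds the doubled values
print(ok)
print(result)

unlink(scratch)
```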

Then simply call batch_process() to apply the function to all the files.

library(batchr)
batch_process(fun, path, ask = FALSE)
#> ✓ file1.csv [00:00:00.002]
#> ✓ file2.csv [00:00:00.005]
#> Success: 2
#> Failure: 0
#> Remaining: 0
#> 

The files have been updated as follows.

read.csv(file.path(path, "file1.csv"))
#>   x
#> 1 2
read.csv(file.path(path, "file2.csv"))
#>   x
#> 1 6

For a more realistic demonstration with finer control over the batch processing see the Batchr Demonstration vignette.
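As a rough idea of what that finer control might look like, the sketch below splits the work into separate configure, run, inspect and cleanup steps using functions listed in the package index. The argument names and order shown here (in particular `regexp` and `ask`) are assumptions and should be checked against the package reference; the directory name is hypothetical.

```r
library(batchr)

# Hypothetical example directory with a single file.
path <- file.path(tempdir(), "example-steps")
unlink(path, recursive = TRUE, force = TRUE)
dir.create(path)
write.csv(data.frame(x = 1), file.path(path, "file1.csv"), row.names = FALSE)

fun <- function(file) {
  data <- read.csv(file)
  data$x <- data$x * 2
  write.csv(data, file, row.names = FALSE)
  TRUE
}

# Assumed signatures: verify against the batchr documentation before relying on them.
batch_config(fun, path, regexp = "[.]csv$") # record the function and configuration time
batch_run(path, ask = FALSE)                # process the files that are still remaining
log <- batch_log_read(path)                 # read the hidden log of processing attempts
print(log)
batch_cleanup(path)                         # remove the hidden configuration and log files
```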

Parallel Processing

To process the files in parallel, set a future plan before calling batch_process(), for example:

library(future)
plan(multisession)
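A slightly fuller sketch, assuming the future package is installed: it caps the number of workers and restores the previous plan afterwards. The choice of two workers is arbitrary, and the batch_process() call is left as a comment because it depends on the `fun` and `path` defined earlier.

```r
library(future)

# plan() returns the previous plan, so it can be restored later.
old_plan <- plan(multisession, workers = 2)
workers_during <- nbrOfWorkers()
print(workers_during)

# ... call batch_process(fun, path, ask = FALSE) here ...

# Restore whatever plan was active before.
plan(old_plan)
```

Restoring the plan keeps the background R sessions from lingering once the batch job is done.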

Contribution

Please report any issues.

Pull requests are always welcome.

Code of Conduct

Please note that the batchr project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Version: 0.0.2
License: MIT + file LICENSE
Maintainer: Joe Thorley
Last Published: October 3rd, 2021
Monthly Downloads: 7

Functions in batchr (0.0.2)

batch_config_read: Read Configuration File
batch_report: Batch Report
batch_reconfig_fun: Reconfigures Batch Processing Function
batchr-package: batchr: Batch Process Files
batch_seeds: L'Ecuyer-CMRG Seeds
batch_run: Runs Batch Processing
batch_config: Configure Batch Processing
batch_is_clean: Is Clean
batch_log_read: Read Log File
batch_completed: Batch Completed?
batch_cleanup: Cleanup Batch Processing
batch_reconfig_fileset: Reconfigures Batch Processing File Set
batch_process: Batch File Processing
batch_files_remaining: Batch Files
batch_file_status: Batch File Status