
mcprogress (version 0.1.1)

mcProgressBar: Show progress bar during parallel processing

Description

Uses echo to safely output a progress bar to the RStudio or Linux console during parallel processing.

Usage

mcProgressBar(
  val,
  len = 1L,
  cores = 1L,
  subval = NULL,
  title = "",
  spinner = FALSE,
  eta = TRUE,
  start = NULL,
  sensitivity = 0.01
)

closeProgress(start = NULL, title = "", eta = TRUE)

Value

No return value. Prints a progress bar to the console if called within an RStudio or Linux environment.

Arguments

val

Integer measuring progress.

len

Total number of processes to be executed overall.

cores

Number of cores used for parallel processing.

subval

Optional subvalue ranging from 0 to 1 to enable granularity during long processes. Especially useful if len is small relative to cores.

title

Optional title for the progress bar.

spinner

Logical whether to show a spinner which moves when each core completes a process. More useful for relatively long processes where the length of time for each process to complete is variable. Not shown if subval is used. Can add significant overhead if len is large and each process is very fast.

eta

Logical whether to show estimated time to completion. The system time at the start must be supplied via start with each call to mcProgressBar() in order to estimate the time to completion.

start

Used to pass the system time from the start of the call to show a total time elapsed. See the example below.

sensitivity

Determines maximum sensitivity with which to report progress for situations where len is large, to reduce overhead. Default 0.01 refers to 1%. Not used if subval is invoked.
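
As a rough illustration of what the sensitivity argument implies, the sketch below counts how many distinct progress reports would occur if progress were only reported when val / len moves into a new multiple of sensitivity. The helper n_updates() is hypothetical, and the thresholding rule is an assumption made for illustration, not the package's internal code.

```r
# Hypothetical helper (not part of mcprogress): count reports if progress is
# only printed when val / len reaches a new multiple of `sensitivity`.
# The rule itself is an assumption for illustration.
n_updates <- function(len, sensitivity = 0.01) {
  # small offset guards against floating point values just below an integer
  steps <- floor(seq_len(len) / len / sensitivity + 1e-9)
  length(unique(steps[steps > 0]))
}

n_updates(10000)  # far fewer reports than the 10000 values of val
n_updates(50)     # small len: every value of val triggers a report
```

Under this assumed rule, a large len is capped at roughly 1 / sensitivity reports, which is where the reduction in overhead would come from.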

Author

Myles Lewis

Details

This package provides two main methods to show progress during code parallelised with mclapply(). If X (the list object looped over in a call to mclapply()) has many elements compared to the number of cores, then it is easiest to use pmclapply().

However, in some use cases the length of X is comparable to the number of cores and each process may take a long time. For example, machine learning applied to each of 8 folds on an 8-core machine will open 8 processes from the outset, and the processes will often complete at roughly the same time. In this case pmclapply() is much less informative: it only shows completion at the end of each round of processes, so it goes straight from 0% to 100%.

If the code for each process is long and subprogress can be reported along the way, for example during nested loops, then mcProgressBar() provides a way to show the subprogress during the inner loop. The example below shows how to write code involving an outer call to mclapply() and an inner loop whose subprogress is tracked via calls to mcProgressBar().
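
To make the case of 8 folds on 8 cores concrete, the following sketch lists the completion percentages that become visible if progress is only reported when a whole round of processes finishes. The helper round_progress() is hypothetical and simplifies scheduling by assuming that all processes in a round finish together.

```r
# Hypothetical helper (not part of mcprogress): percentages visible when
# reporting happens only at the end of each round of `cores` processes.
round_progress <- function(len, cores) {
  rounds <- ceiling(len / cores)               # number of rounds needed
  done <- pmin(seq_len(rounds) * cores, len)   # processes finished per round
  round(100 * done / len)
}

round_progress(len = 8, cores = 8)    # one round: a single jump to 100
round_progress(len = 100, cores = 8)  # 13 rounds: many intermediate values
```

With only one round there is nothing to show between 0% and 100%, which is the situation where reporting subprogress from inside each process becomes useful.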

Technically only 1 process can be tracked. If cores is set to 4 and subval is invoked, then the 1st, 5th, 9th, 13th, etc. processes are tracked. Subprogress of the tracked process is computed as a fraction of the number of blocks of processes required. ETA is approximate. To minimise overhead, it is only updated with each change in progress (i.e. each time a block of processes completes) or when subprogress changes. It is not updated by interrupt.
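
The tracking rule described above can be sketched in base R. Which indices are tracked follows from the text; the formula combining the block number and subval into overall progress is an assumption made for illustration. tracked_index() and overall_progress() are hypothetical names, not package functions.

```r
# Hypothetical helper: indices whose subprogress would be shown
# (the 1st, 5th, 9th, ... process when cores = 4).
tracked_index <- function(len, cores) seq(1, len, by = cores)

# Assumed formula (not taken from the package source): completed blocks plus
# the fraction of the current block, divided by the number of blocks.
overall_progress <- function(val, len, cores, subval) {
  nblocks <- ceiling(len / cores)  # blocks of processes required
  block <- ceiling(val / cores)    # block containing process `val`
  (block - 1 + subval) / nblocks
}

tracked_index(len = 16, cores = 4)
overall_progress(val = 5, len = 16, cores = 4, subval = 0.5)
```

In this sketch the 5th process sits in the second of four blocks, so reaching half of its own work corresponds to overall progress between one quarter and one half.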

See Also

pmclapply(), mclapply()

Examples

if (Sys.info()["sysname"] != "Windows") {

## Example function with mclapply wrapped around another nested function
library(parallel)

my_fun <- function(x, cores) {
  start <- Sys.time()
  mcProgressBar(0, title = "my_fun")  # initialise progress bar
  res <- mclapply(seq_along(x), function(i) {
    # inner loop of calculation
    y <- 1:4
    inner <- lapply(seq_along(y), function(j) {
      Sys.sleep(0.2 + runif(1) * 0.1)
      mcProgressBar(val = i, len = length(x), cores, subval = j / length(y),
                    title = "my_fun")
      rnorm(4)
    })
    inner
  }, mc.cores = cores)
  closeProgress(start, title = "my_fun")  # finalise the progress bar
  res
}

res <- my_fun(letters[1:4], cores = 2)

## Example of long function
longfun <- function(x, cores) {
  start <- Sys.time()
  mcProgressBar(0, title = "longfun")  # initialise progress bar
  res <- mclapply(seq_along(x), function(i) {
    # long sequential calculation in parallel with 3 major steps
    Sys.sleep(0.2)
    mcProgressBar(val = i, len = length(x), cores, subval = 0.33,
                  title = "longfun")  # 33% complete
    Sys.sleep(0.2)
    mcProgressBar(val = i, len = length(x), cores, subval = 0.66,
                  title = "longfun")  # 66% complete
    Sys.sleep(0.2)
    mcProgressBar(val = i, len = length(x), cores, subval = 1,
                  title = "longfun")  # 100% complete
    return(rnorm(4))
  }, mc.cores = cores)
  closeProgress(start, title = "longfun")  # finalise the progress bar
  res
}

res <- longfun(letters[1:2], cores = 2)

}
