bpvec: Parallel, vectorized evaluation

Description

bpvec applies FUN to subsets of X. Any type of object X is allowed, provided length, [, and c methods are available. The return value is a vector of length equal to X, as with FUN(X).

Usage

bpvec(X, FUN, ..., AGGREGATE=c, BPREDO=list(), BPPARAM=bpparam())
"bpvec"(X, FUN, ..., AGGREGATE=c,  BPREDO=list(), BPPARAM=bpparam())

Arguments

X

Any object for which methods length, [, and c are implemented.

FUN

The function to be applied to subsets of X.

...

Additional arguments for FUN.

AGGREGATE

A function taking any number of arguments ..., called to reduce the results from parallel jobs (the elements of the ... argument of AGGREGATE). The default, c, concatenates objects and is appropriate for vectors; rbind might be appropriate for data frames.

BPPARAM

An optional BiocParallelParam instance determining the parallel back-end to be used during evaluation.

BPREDO

A list of output from bpvec with one or more failed elements. When a list is given in BPREDO, bpok is used to identify errors; the failed tasks are rerun and their results are inserted into the original results.

Value

The result should be identical to FUN(X, ...) (assuming that AGGREGATE is set appropriately).

Details

When BPPARAM is a MulticoreParam this method dispatches to the pvec function from the parallel package. For all other BiocParallelParams, this method creates a vector of indices for X that divide the elements as evenly as possible given the number of workers. Indices and data are passed to bplapply for parallel evaluation. SnowParam and MulticoreParam offer further control over the division of X through the tasks argument. See ?bptasks.
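The index-splitting step described above can be sketched in base R. This is a minimal illustration and not BiocParallel's internal code: it assumes parallel::splitIndices as a stand-in for the package's own splitting helper, uses a hypothetical worker count n_workers, and substitutes a serial lapply for bplapply.

```r
library(parallel)  # for splitIndices(); the parallel package ships with R

X <- 1:10
n_workers <- 3L     # hypothetical worker count, chosen for illustration

# Divide the indices of X into n_workers nearly equal, contiguous chunks.
idx <- splitIndices(length(X), n_workers)

# Apply the vectorized function once per chunk (serially here; bpvec
# would hand the chunks to bplapply instead).
chunks <- lapply(idx, function(i) sqrt(X[i]))

# Recombine the per-chunk results, as AGGREGATE = c would.
result <- do.call(c, chunks)
```

Because sqrt is vectorized, result should match sqrt(X) no matter how many chunks are used.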

The distinction between bpvec and bplapply is that bplapply applies FUN to each element of X separately, whereas bpvec assumes the function is vectorized, i.e., c(FUN(x[1]), FUN(x[2])) is equivalent to FUN(x[1:2]). This approach can be more efficient than bplapply, but it requires that FUN take a vector input and return a vector output of the same length whose values do not depend on how the input is partitioned. This behavior is consistent with parallel::pvec; consult the ?pvec man page for further details.
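The vectorization assumption can be checked by hand before calling bpvec. The sketch below uses only base R; chunked_apply is a hypothetical helper written for this illustration (it is not part of BiocParallel) that mimics evaluating FUN on separate pieces of the input and concatenating the results.

```r
# Evaluate FUN on contiguous chunks of x and concatenate the results.
# chunked_apply() is a hypothetical helper for this illustration only.
chunked_apply <- function(x, FUN, n_chunks = 2L) {
    groups <- cut(seq_along(x), n_chunks, labels = FALSE)
    pieces <- lapply(split(x, groups), FUN)
    do.call(c, unname(pieces))
}

x <- c(1, 4, 9, 16)

# sqrt() works element by element, so chunking does not change the result.
sqrt_ok <- isTRUE(all.equal(chunked_apply(x, sqrt), sqrt(x)))

# cumsum() depends on earlier elements, so each chunk restarts its sum
# and the chunked result differs from cumsum(x).
cumsum_ok <- isTRUE(all.equal(chunked_apply(x, cumsum), cumsum(x)))
```

A function like cumsum, whose output for one element depends on other elements, is a poor fit for bpvec even though it returns a vector of the right length.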

Examples

showMethods("bpvec")

## ten tasks (1:10), called with as many back-end elements as are specified
## by BPPARAM.  Compare with bplapply.
fun <- function(v) {
    message("working")
    sqrt(v)
}
system.time(result <- bpvec(1:10, fun)) 
result
