snow-cluster: Cluster-Level SNOW Functions

Description

Functions for computing on a SNOW cluster.

Usage

clusterSplit(cl, seq)
clusterCall(cl, fun, ...)
clusterApply(cl, x, fun, ...)
clusterApplyLB(cl, x, fun, ...)
clusterEvalQ(cl, expr)
clusterExport(cl, list, envir = .GlobalEnv)
clusterMap(cl, fun, ..., MoreArgs = NULL, RECYCLE = TRUE)

Arguments

cluster object

fun

function or character string naming a function

expr

expression to evaluate

seq

vector to split

list

character vector of variables to export

envir

environment from which t export variables

array

...

additional arguments to pass to standard function

MoreArgs

additional argument for fun

RECYCLE

logical; if true shorter arguments are recycled

Details

These are the basic functions for computing on a cluster. All evaluations on the slave nodes are done using tryCatch. Currently an error is signaled on the master if any one of the nodes produces an error. More sophisticated approaches will be considered in the future.

clusterCall calls a function fun with identical arguments ... on each node in the cluster cl and returns a list of the results.

clusterEvalQ evaluates a literal expression on each cluster node. It a cluster version of evalq, and is a convenience function defined in terms of clusterCall.

clusterApply calls fun on the first cluster node with arguments seq[[1]] and ..., on the second node with seq[[2]] and ..., and so on. If the length of seq is greater than the number of nodes in the cluster then cluster nodes are recycled. A list of the results is returned; the length of the result list will equal the length of seq.

clusterApplyLB is a load balancing version of clusterApply. if the length p of seq is greater than the number of cluster nodes n, then the first n jobs are placed in order on the n nodes. When the first job completes, the next job is placed on the available node; this continues until all jobs are complete. Using clusterApplyLB can result in better cluster utilization than using clusterApply. However, increased communication can reduce performance. Furthermore, the node that executes a particular job is nondeterministic, which can complicate ensuring reproducibility in simulations.

clusterMap is a multi-argument version of clusterApply, analogous to mapply. If RECYCLE is true shorter arguments are recycled; otherwise, the result length is the length of the shortest argument. Cluster nodes are recycled if the length of the result is greater than the number of nodes.

clusterExport assigns the values on the master of the variables named in list to variables of the same names in the global environments of each node. The environment on the master from which variables are exported defaults to the global environment.

clusterSplit splits seq into one consecutive piece for each cluster and returns the result as a list with length equal to the number of cluster nodes. Currently the pieces are chosen to be close to equal in length. Future releases may attempt to use relative performance information about nodes to choose split proportional to performance.

For more details see http://www.stat.uiowa.edu/~luke/R/cluster/cluster.html.

Examples

Run this code

# NOT RUN {
  
# }
# NOT RUN {
cl <- makeSOCKcluster(c("localhost","localhost"))

clusterApply(cl, 1:2, get("+"), 3)

clusterEvalQ(cl, library(boot))

x<-1
clusterExport(cl, "x")
clusterCall(cl, function(y) x + y, 2)

  
# }

Run the code above in your browser using DataLab