Start parallel clusters using parallel package
parallel_start(..., .method = c("parallel", "spark"))
parallel_stop()
...: Parameters passed to underlying functions (see the Details section).
.method: The method used to create the parallel backend. Supports:
"parallel" - Uses the parallel and doParallel packages
"spark" - Uses the sparklyr package
With .method = "parallel", parallel_start() performs 3 steps:
Makes clusters using parallel::makeCluster(...). The arguments in parallel_start(...) are passed to parallel::makeCluster(...).
Registers clusters using doParallel::registerDoParallel().
Adds the main session's .libPaths() to each worker using parallel::clusterCall().
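The three steps for the "parallel" method can be sketched roughly as follows. This is an illustration only, not the package's own source: start_cluster_sketch(), set_lib_paths() and get_lib_paths() are hypothetical helper names, and the doParallel step is skipped if that package is not installed.

```r
# Small top-level wrappers so that each worker calls its own .libPaths()
# (hypothetical helpers, used only for this sketch)
set_lib_paths <- function(paths) .libPaths(paths)
get_lib_paths <- function() .libPaths()

start_cluster_sketch <- function(...) {
  # Step 1: create the cluster; `...` is forwarded to parallel::makeCluster()
  cl <- parallel::makeCluster(...)

  # Step 2: register the cluster as the foreach backend (requires doParallel)
  if (requireNamespace("doParallel", quietly = TRUE)) {
    doParallel::registerDoParallel(cl)
  }

  # Step 3: pass the main session's library paths to every worker
  parallel::clusterCall(cl, set_lib_paths, .libPaths())

  cl
}

# Start 2 workers, inspect their library paths, then shut the cluster down
cl <- start_cluster_sketch(2)
worker_paths <- parallel::clusterCall(cl, get_lib_paths)
parallel::stopCluster(cl)
```

Passing the library paths matters when packages are installed in a non-default location: without it, a worker may fail to load a package that the main session can load.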
Important: with .method = "spark", first create a Spark connection using sparklyr::spark_connect().
Pass the connection object as the first argument. For example, parallel_start(sc, .method = "spark").
The remaining arguments in parallel_start(...) are passed to sparklyr::registerDoSpark(...).
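A minimal sketch of the Spark workflow is shown below. It assumes that sparklyr and a local Spark installation are available, and that parallel_start() and parallel_stop() are the functions exported by the modeltime package. The wrapper run_with_spark_backend() and its work argument are hypothetical names introduced for illustration.

```r
# Hypothetical wrapper: connect to Spark, register the backend,
# run some work, then clean up in reverse order.
run_with_spark_backend <- function(work, master = "local") {
  # Create the Spark connection first
  sc <- sparklyr::spark_connect(master = master)

  # Pass the connection object as the first argument
  # (assumes parallel_start() comes from the modeltime package)
  modeltime::parallel_start(sc, .method = "spark")

  # Run the caller-supplied function while the Spark backend is registered
  result <- work()

  # Return to sequential processing, then close the connection
  modeltime::parallel_stop()
  sparklyr::spark_disconnect(sc)

  result
}
```

Defining the wrapper does not start Spark; the connection is only opened when run_with_spark_backend() is called.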
# Starts 2 clusters
parallel_start(2)

# Returns to sequential processing
parallel_stop()