Learn R Programming


pbdMPI

  • Version: 0.3-4
  • License: Mozilla Public License 2.0
  • Author: See section below.

With few exceptions (ff, bigalgebra, etc.), R does its computations in memory. When data becomes too large to fit in the memory of a single node, or when a job needs more processors than commodity hardware offers (roughly 16), a typical strategy is to add more nodes. MPI, the "Message Passing Interface", is the standard for managing communication among those nodes. pbdMPI is a package that greatly simplifies the use of MPI from R.

In pbdMPI, we make extensive use of R's S4 system to simplify the interface significantly. Instead of needing to specify the type (e.g., integer or double) of the data via function name (as in C implementations) or in an argument (as in Rmpi), you need only call the generic function on your data and we will always "do the right thing".
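As a minimal sketch of what "do the right thing" means in practice (run via `mpirun`; the file name and data values here are illustrative), the same generic `bcast()` handles integer and double data with no type flag:

```r
# Save as, e.g., dispatch.r and run with: mpirun -np 2 Rscript dispatch.r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

# Rank 0 holds the real data; other ranks hold placeholders of the same shape
x <- if (comm.rank() == 0) 1L:3L else integer(3)        # integer vector
y <- if (comm.rank() == 0) c(pi, exp(1)) else double(2) # double vector

# No MPI_INT vs. MPI_DOUBLE, and no type argument as in Rmpi:
# S4 dispatch on the data selects the appropriate method
x <- bcast(x)
y <- bcast(y)

comm.print(list(x = x, y = y), all.rank = TRUE)
finalize()
```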

In pbdMPI, we write programs in the "Single Program/Multiple Data", or SPMD, style. Contrary to the way much of the R world is acquainted with parallelism, there is no "master" or "manager". Each process (MPI rank) runs the same copy of the program as every other process, but operates on its own data. This is arguably one of the simplest extensions of serial to massively parallel programming, and it has been the standard way of doing things in the HPC community for over 20 years.
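The SPMD idea can be sketched as follows (the chunking scheme is illustrative, not prescribed by pbdMPI): every rank runs the identical script, selects its own slice of the data, and a collective operation combines the partial results.

```r
# Save as, e.g., spmd_sum.r and run with: mpirun -np 4 Rscript spmd_sum.r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

# Every rank executes this same code, but picks out its own chunk of 1:100
n <- 100
chunks <- split(1:n, cut(1:n, comm.size(), labels = FALSE))
my.chunk <- chunks[[comm.rank() + 1]]  # MPI ranks are 0-based

# Each rank sums only its chunk; allreduce() combines the partial sums
# so that every rank ends up holding the grand total (sum(1:100) is 5050)
total <- allreduce(sum(my.chunk), op = "sum")
comm.print(total)

finalize()
```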

Usage

If you are comfortable with MPI concepts, you should find pbdMPI very agreeable and simple to use. Below is a basic "hello world" program:

# load the package
suppressMessages(library(pbdMPI, quietly = TRUE))

# initialize the MPI communicators
init()

# Hello world
msg <- paste("Hello from rank", comm.rank(), "of", comm.size())
comm.print(msg, all.rank = TRUE, quiet = TRUE)

# shut down the communicators and exit
finalize()

Save this as, say, mpi_hello_world.r and run it via:

mpirun -np 4 Rscript mpi_hello_world.r

The function comm.print() is a "sugar" function custom to pbdMPI that makes it simple to print in a distributed environment. The argument all.rank=TRUE specifies that all MPI ranks should print, and the quiet=TRUE argument tells each rank not to "announce" itself when it does its printing.
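For contrast, a small sketch of the default behavior (per the pbdMPI documentation, the defaults are `all.rank = FALSE`, `rank.print = 0`, and `quiet = FALSE`): only rank 0 prints, preceded by an announcement line identifying it.

```r
# Run with, e.g.: mpirun -np 4 Rscript print_defaults.r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

comm.print(comm.size())                  # only rank 0 prints, and announces itself
comm.print(comm.rank(), rank.print = 1)  # print from rank 1 instead of rank 0

finalize()
```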

Numerous other examples can be found in both the pbdMPI vignette as well as the pbdDEMO package and its corresponding vignette.

Installation

pbdMPI requires

  • R version 3.0.0 or higher
  • A system installation of MPI:
    • SUN HPC 8.2.1 (OpenMPI) for Solaris.
    • OpenMPI for Linux.
    • OpenMPI for Mac OS X.
    • MS-MPI for Windows.

The package can be installed from CRAN via the usual install.packages("pbdMPI"), or from GitHub via the devtools package:

library(devtools)
install_github("RBigData/pbdMPI")

For additional installation information, see:

  • "INSTALL" for Solaris, Linux, and Mac OS.
  • "INSTALL.win.*" for Windows.

More information about pbdMPI, including installation troubleshooting, can be found in:

  1. pbdMPI vignette at 'pbdMPI/inst/doc/pbdMPI-guide.pdf'.
  2. 'http://r-pbd.org/'.

Authors

pbdMPI is authored and maintained by the pbdR core team:

  • Wei-Chen Chen
  • George Ostrouchov
  • Drew Schmidt
  • Pragneshkumar Patel

With additional contributions from:

  • Hao Yu
  • Christian Heckendorf
  • Brian Ripley (Windows HPC Pack 2012)
  • The R Core team (some functions are modified from the base packages)

  • Install: install.packages('pbdMPI')
  • Monthly Downloads: 908
  • Version: 0.3-8
  • License: Mozilla Public License 2.0
  • Maintainer: Wei-Chen Chen
  • Last Published: August 7th, 2018

Functions in pbdMPI (0.3-8)

  • Set global pbd options: Set Global pbdR Options
  • global balanc: Global Balance Functions
  • allgather-method: All Ranks Gather Objects from Every Rank
  • MPI array pointers: Set or Get MPI Array Pointers in R
  • global reading: Global Reading Functions
  • communicator: Communicator Functions
  • global writing: Global Writing Functions
  • bcast-method: A Rank Broadcast an Object to Every Rank
  • allreduce-method: All Ranks Receive a Reduction of Objects from Every Rank
  • send-method: A Rank Send (blocking) an Object to the Other Rank
  • irecv-method: A Rank Receives (Nonblocking) an Object from the Other Rank
  • probe: Probe Functions
  • pbdMPI-package: Programming with Big Data -- Interface to MPI
  • global Rprof: A Rprof Function for SPMD Routines
  • SPMD Control: Sets of controls in pbdMPI.
  • global stop and warning: Global Stop and Warning Functions
  • gather-method: A Rank Gathers Objects from Every Rank
  • isend-method: A Rank Send (Nonblocking) an Object to the Other Rank
  • global base: Global Base Functions
  • sendrecv.replace-method: Send and Receive an Object to and from Other Ranks
  • sendrecv-method: Send and Receive an Object to and from Other Ranks
  • global distance function: Global Distance for Distributed Matrices
  • SPMD Control Functions: Sets of controls in pbdMPI.
  • seed for RNG: Seed Functions for Random Number Generators
  • reduce-method: A Rank Receive a Reduction of Objects from Every Rank
  • SPMD Internal Functions: All SPMD Internal Functions
  • get job id: Divide Job ID by Ranks
  • sourcetag: Functions to Obtain source and tag
  • global as.gbd: Global As GBD Function
  • alltoall: All to All
  • global range, max, and min: Global Range, Max, and Min Functions
  • global print and cat: Global Print and Cat Functions
  • global timer: A Timing Function for SPMD Routines
  • global all pairs: Global All Pairs
  • global any and all: Global Any and All Functions
  • global sort: Global Quick Sort for Distributed Vectors or Matrices
  • Task Pull: Functions for Task Pull Parallelism
  • Comm Internal Functions: All Comm Internal Functions
  • global which, which.max, and which.min: Global Which Functions
  • wait: Wait Functions
  • Utility execmpi: Execute MPI code in system
  • global match.arg: Global Argument Matching
  • global pairwise: Global Pairwise Evaluations
  • apply and lapply: Parallel Apply and Lapply Functions
  • scatter-method: A Rank Scatter Objects to Every Rank
  • recv-method: A Rank Receives (Blocking) an Object from the Other Rank
  • info: Info Functions
  • is.comm.null: Check if a MPI_COMM_NULL