Learn R Programming

⚠️There's a newer version (0.5-3) of this package.Take me there.

pbdMPI

  • License:
  • Download:
  • Status:
  • Author: See section below.

With few exceptions (ff, bigalgebra, etc.), R does computations in memory. When data becomes too large to handle in the memory of a single node, or when more processors than those offered in commodity hardware (~16) are needed for a job, a typical strategy is to add more nodes. MPI, or the "Message Passing Interface", is the standard for managing multi-node communication. pbdMPI is a package that greatly simplifies the use of MPI from R.

In pbdMPI, we make extensive use of R's S4 system to simplify the interface significantly. Instead of needing to specify the type (e.g., integer or double) of the data via function name (as in C implementations) or in an argument (as in Rmpi), you need only call the generic function on your data and we will always "do the right thing".

In pbdMPI, we write programs in the "Single Program/Multiple Data" or SPMD style. Contrary to the way much of the R world is aquainted with parallelism, there is no "master" or "manager". Each process (MPI rank) gets runs the same copy of the program as every other process, but operates on its own data. This is arguably one of the simplest extensions of serial to massively parallel programming, and has been the standard way of doing things in the HPC community for over 20 years.

Usage

If you are comfortable with MPI concepts, you should find pbdMPI very agreeable and simple to use. Below is a basic "hello world" program:

# load the package
suppressMessages(library(pbdMPI, quietly = TRUE))

# initialize the MPI communicators
init()

# Hello world
message <- paste("Hello from rank", comm.rank(), "of", comm.size())
comm.print(message, all.rank=TRUE, quiet=TRUE)

# shut down the communicators and exit
finalize()

Save this as, say, mpi_hello_world.r and run it via:

mpirun -np 4 Rscript mpi_hello_world.r

The function comm.print() is a "sugar" function custom to pbdMPI that makes it simple to print in a distributed environment. The argument all.rank=TRUE specifies that all MPI ranks should print, and the quiet=TRUE argument tells each rank not to "announce" itself when it does its printing.

Numerous other examples can be found in both the pbdMPI vignette as well as the pbdDEMO package and its corresponding vignette.

Installation

pbdMPI requires

  • R version 3.0.0 or higher
  • A system installation of MPI:
    • SUN HPC 8.2.1 (OpenMPI) for Solaris.
    • OpenMPI for Linux.
    • OpenMPI for Mac OS X.
    • MS-MPI for Windows.

The package can be installed from the CRAN via the usual install.packages("pbdMPI"), or via the devtools package:

library(devtools)
install_github("RBigData/pbdMPI")

For additional installation information, see:

  • see "INSTALL" for Solaris, Linux and Mac OS.
  • see "INSTALL.win.*" for Windows.

More information about pbdMPI, including installation troubleshooting, can be found in:

  1. pbdMPI vignette at 'pbdMPI/inst/doc/pbdMPI-guide.pdf'.
  2. 'http://r-pbd.org/'.

Authors

pbdMPI is authored and maintained by the pbdR core team:

  • Wei-Chen Chen
  • George Ostrouchov
  • Drew Schmidt
  • Pragneshkumar Patel

With additional contributions from:

  • Hao Yu
  • Christian Heckendorf
  • Brian Ripley (Windows HPC Pack 2012)
  • The R Core team (some functions are modified from the base packages)

Copy Link

Version

Install

install.packages('pbdMPI')

Monthly Downloads

908

Version

0.4-4

License

Mozilla Public License 2.0

Maintainer

Wei-Chen Chen

Last Published

November 6th, 2021

Functions in pbdMPI (0.4-4)

SPMD Control

Sets of controls in pbdMPI.
Set global pbd options

Set Global pbdR Options
allgather-method

All Ranks Gather Objects from Every Rank
pbdMPI-package

Programming with Big Data -- Interface to MPI
scatter-method

A Rank Scatter Objects to Every Rank
reduce-method

A Rank Receive a Reduction of Objects from Every Rank
gather-method

A Rank Gathers Objects from Every Rank
info

Info Functions
is.comm.null

Check if a MPI_COMM_NULL
send-method

A Rank Send (blocking) an Object to the Other Rank
irecv-method

A Rank Receives (Nonblocking) an Object from the Other Rank
bcast-method

A Rank Broadcast an Object to Every Rank
sendrecv.replace-method

Send and Receive an Object to and from Other Ranks
MPI array pointers

Set or Get MPI Array Pointers in R
communicator

Communicator Functions
allreduce-method

All Ranks Receive a Reduction of Objects from Every Rank
sendrecv-method

Send and Receive an Object to and from Other Ranks
recv-method

A Rank Receives (Blocking) an Object from the Other Rank
alltoall

All to All
probe

Probe Functions
isend-method

A Rank Send (Nonblocking) an Object to the Other Rank
Utility execmpi

Execute MPI code in system
wait

Wait Functions
global as.gbd

Global As GBD Function
seed for RNG

Seed Functions for Random Number Generators
global pairwise

Global Pairwise Evaluations
sourcetag

Functions to Obtain source and tag
global balanc

Global Balance Functions
global match.arg

Global Argument Matching
global reading

Global Reading Functions
global writing

Global Writing Functions
global distance function

Global Distance for Distributed Matrices
global base

Global Base Functions
global Rprof

A Rprof Function for SPMD Routines
global any and all

Global Any and All Functions
global all pairs

Global All Pairs
Package Tools

Functions for Get/Print MPI_COMM Pointer (Address)
Comm Internal Functions

All Comm Internal Functions
apply and lapply

Parallel Apply and Lapply Functions
global which, which.max, and which.min

Global Which Functions
global stop and warning

Global Stop and Warning Functions
global timer

A Timing Function for SPMD Routines
Get Configures Used at Compiling Time

Functions to Get MPI and/or pbdMPI Configures Used at Compiling Time
global range, max, and min

Global Range, Max, and Min Functions
get job id

Divide Job ID by Ranks
global sort

Global Quick Sort for Distributed Vectors or Matrices
Task Pull

Functions for Task Pull Parallelism
global print and cat

Global Print and Cat Functions
SPMD Control Functions

Sets of controls in pbdMPI.
SPMD Internal Functions

All SPMD Internal Functions