pbdMPI

  • Author: See section below.

With few exceptions (ff, bigalgebra, etc.), R does computations in memory. When data becomes too large to handle in the memory of a single node, or when more processors than those offered in commodity hardware (~16) are needed for a job, a typical strategy is to add more nodes. MPI, or the "Message Passing Interface", is the standard for managing multi-node communication. pbdMPI is a package that greatly simplifies the use of MPI from R.

In pbdMPI, we make extensive use of R's S4 system to simplify the interface significantly. Instead of needing to specify the type (e.g., integer or double) of the data via function name (as in C implementations) or in an argument (as in Rmpi), you need only call the generic function on your data and we will always "do the right thing".
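As a rough illustration of this type-agnostic interface, the sketch below calls the same allreduce() generic on integer data and on double data without naming the type anywhere. The variable names (n, x_int, x_dbl, sum_int, sum_dbl) are made up for this example, and op = "sum" is written out explicitly rather than relying on a default.

```r
# load the package and initialize the MPI communicators
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

n <- comm.size()

# each rank holds one integer value and one double value
x_int <- as.integer(comm.rank() + 1L)
x_dbl <- as.double(comm.rank()) + 0.5

# the same generic is called for both types; dispatch picks the method
sum_int <- allreduce(x_int, op = "sum")
sum_dbl <- allreduce(x_dbl, op = "sum")

# by default only rank 0 prints
comm.print(sum_int)
comm.print(sum_dbl)

finalize()
```

With n ranks, sum_int should equal n(n+1)/2 and sum_dbl should equal n^2/2 on every rank.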

In pbdMPI, we write programs in the "Single Program/Multiple Data" or SPMD style. Contrary to the way much of the R world is acquainted with parallelism, there is no "master" or "manager". Each process (MPI rank) runs the same copy of the program as every other process, but operates on its own data. This is arguably one of the simplest extensions of serial to massively parallel programming, and has been the standard way of doing things in the HPC community for over 20 years.
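A minimal sketch of the SPMD idea, assuming the work is a sum of squares over the indices 1 to N: every rank runs the identical script, picks its own share of the indices with get.jid(), computes a local result, and the ranks then combine their results with allreduce(). The names N, my_ids, local_sum and total are hypothetical and chosen only for this example.

```r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

# total amount of work, identical on every rank
N <- 100L

# indices assigned to this rank; no manager hands them out
my_ids <- get.jid(N)

# each rank works only on its own share of the data
local_sum <- sum(as.numeric(my_ids)^2)

# combine the partial results; every rank receives the total
total <- allreduce(local_sum, op = "sum")

comm.print(total)

finalize()
```

Regardless of how many ranks are used, total should come out as 338350, the sum of the squares of 1 through 100.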

Usage

If you are comfortable with MPI concepts, you should find pbdMPI very agreeable and simple to use. Below is a basic "hello world" program:

# load the package
suppressMessages(library(pbdMPI, quietly = TRUE))

# initialize the MPI communicators
init()

# Hello world
message <- paste("Hello from rank", comm.rank(), "of", comm.size())
comm.print(message, all.rank=TRUE, quiet=TRUE)

# shut down the communicators and exit
finalize()

Save this as, say, mpi_hello_world.r and run it via:

mpirun -np 4 Rscript mpi_hello_world.r

The function comm.print() is a "sugar" function custom to pbdMPI that makes it simple to print in a distributed environment. The argument all.rank=TRUE specifies that all MPI ranks should print, and the quiet=TRUE argument tells each rank not to "announce" itself when it does its printing.
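The sketch below shows a few other ways of controlling who prints, under the assumption that the defaults have not been changed: without all.rank=TRUE only one rank prints, the rank.print argument selects which rank that is, and comm.cat() is the cat()-style counterpart of comm.print(). The variable last_rank is introduced only for this example.

```r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

# with default settings, a single rank prints and announces itself
comm.print("printed once")

# choose which single rank does the printing
last_rank <- comm.size() - 1L
comm.print("printed by the last rank", rank.print = last_rank)

# cat-style output from every rank, without the rank announcement
comm.cat("rank", comm.rank(), "reporting\n", all.rank = TRUE, quiet = TRUE)

finalize()
```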

Numerous other examples can be found in the pbdMPI vignette, and in the pbdDEMO package and its corresponding vignette.

Installation

pbdMPI requires

  • R version 3.0.0 or higher
  • A system installation of MPI:
    • SUN HPC 8.2.1 (OpenMPI) for Solaris.
    • OpenMPI for Linux.
    • OpenMPI for Mac OS X.
    • MS-MPI for Windows.

The package can be installed from CRAN via the usual install.packages("pbdMPI"), or from GitHub via the devtools package:

library(devtools)
install_github("RBigData/pbdMPI")

For additional installation information, see:

  • "INSTALL" for Solaris, Linux and Mac OS.
  • "INSTALL.win.*" for Windows.

More information about pbdMPI, including installation troubleshooting, can be found in:

  1. pbdMPI vignette at 'pbdMPI/inst/doc/pbdMPI-guide.pdf'.
  2. 'http://r-pbd.org/'.

Authors

pbdMPI is authored and maintained by the pbdR core team:

  • Wei-Chen Chen
  • George Ostrouchov
  • Drew Schmidt
  • Pragneshkumar Patel

With additional contributions from:

  • Hao Yu
  • Christian Heckendorf
  • Brian Ripley (Windows HPC Pack 2012)
  • The R Core team (some functions are modified from the base packages)

Package details

  • Version: 0.4-6
  • License: Mozilla Public License 2.0
  • Maintainer: Wei-Chen Chen
  • Last published: October 25th, 2022
  • Monthly downloads: 712

Functions in pbdMPI (0.4-6)

  • allreduce-method: All Ranks Receive a Reduction of Objects from Every Rank
  • Set global pbd options: Set Global pbdR Options
  • bcast-method: A Rank Broadcast an Object to Every Rank
  • scatter-method: A Rank Scatter Objects to Every Rank
  • gather-method: A Rank Gathers Objects from Every Rank
  • recv-method: A Rank Receives (Blocking) an Object from the Other Rank
  • isend-method: A Rank Send (Nonblocking) an Object to the Other Rank
  • pbdMPI-package: Programming with Big Data -- Interface to MPI
  • reduce-method: A Rank Receive a Reduction of Objects from Every Rank
  • allgather-method: All Ranks Gather Objects from Every Rank
  • sendrecv.replace-method: Send and Receive an Object to and from Other Ranks
  • alltoall: All to All
  • global as.gbd: Global As GBD Function
  • sendrecv-method: Send and Receive an Object to and from Other Ranks
  • SPMD Control: Sets of controls in pbdMPI.
  • global print and cat: Global Print and Cat Functions
  • send-method: A Rank Send (blocking) an Object to the Other Rank
  • comm.chunk: comm.chunk
  • info: Info Functions
  • seed for RNG: Seed Functions for Random Number Generators
  • irecv-method: A Rank Receives (Nonblocking) an Object from the Other Rank
  • global balanc: Global Balance Functions
  • Utility execmpi: Execute MPI code in system
  • is.comm.null: Check if a MPI_COMM_NULL
  • MPI array pointers: Set or Get MPI Array Pointers in R
  • get job id: Divide Job ID by Ranks
  • probe: Probe Functions
  • global reading: Global Reading Functions
  • global distance function: Global Distance for Distributed Matrices
  • global writing: Global Writing Functions
  • sourcetag: Functions to Obtain source and tag
  • communicator: Communicator Functions
  • global timer: A Timing Function for SPMD Routines
  • global which, which.max, and which.min: Global Which Functions
  • global match.arg: Global Argument Matching
  • global sort: Global Quick Sort for Distributed Vectors or Matrices
  • Comm Internal Functions: All Comm Internal Functions
  • global any and all: Global Any and All Functions
  • global base: Global Base Functions
  • global Rprof: A Rprof Function for SPMD Routines
  • global all pairs: Global All Pairs
  • SPMD Control Functions: Sets of controls in pbdMPI.
  • SPMD Internal Functions: All SPMD Internal Functions
  • global stop and warning: Global Stop and Warning Functions
  • apply and lapply: Parallel Apply and Lapply Functions
  • wait: Wait Functions
  • Task Pull: Functions for Task Pull Parallelism
  • global pairwise: Global Pairwise Evaluations
  • global range, max, and min: Global Range, Max, and Min Functions
  • Get Configures Used at Compiling Time: Functions to Get MPI and/or pbdMPI Configures Used at Compiling Time
  • Package Tools: Functions for Get/Print MPI_COMM Pointer (Address)