# pbdDMAT v0.5-0


## 'pbdR' Distributed Matrix Methods

A set of classes for managing distributed matrices, and a collection of methods for computing linear algebra and statistics. Computation is handled mostly by routines from the 'pbdBASE' package, which itself relies on the 'ScaLAPACK' and 'PBLAS' numerical libraries for distributed computing.

# pbdDMAT

• Version: 0.5-0
• Author: See section below.

pbdDMAT is an R package for distributed matrix algebra and statistics computations over MPI.

With few exceptions (ff, bigalgebra, etc.), R does computations in memory. If a matrix is too large to fit in the memory of a single node, then distributing ownership of the matrix across multiple nodes is an effective strategy for working with such large data.

The pbdDMAT package contains numerous routines to help with the distribution and management of data, as well as functions for summarizing, inspecting, and analyzing distributed matrices.
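As a rough illustration of the distribution and inspection routines mentioned above, the following sketch distributes an ordinary matrix and looks at its global and local dimensions. It assumes the `as.ddmatrix()` converter and the `dim()`, `ldim()` and `bldim()` accessors listed in the function index, plus `comm.print()` from pbdMPI; the script name `inspect_example.r` is hypothetical.

```r
# Sketch only; run with e.g.: mpirun -np 2 Rscript inspect_example.r
suppressMessages(library(pbdDMAT))
init.grid()

# The same ordinary matrix is created on every process.
x <- matrix(as.numeric(1:20), nrow = 5, ncol = 4)

# Distribute ownership of x across the process grid.
dx <- as.ddmatrix(x)

global_dim <- dim(dx)                    # dimensions of the whole matrix
comm.print(global_dim)
comm.print(ldim(dx), all.rank = TRUE)    # dimensions of each local piece
comm.print(bldim(dx))                    # blocking factor used to distribute

# Gather the distributed matrix back into an ordinary matrix.
y <- as.matrix(dx)

finalize()
```

The local dimensions depend on the number of processes and the blocking factor, so no particular output is shown here.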

Often the syntax is identical to serial R: instead of calling cov(x) on an ordinary matrix x, you call it on a distributed matrix x. This is made possible by extensive use of R's S3 and S4 methods.
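A minimal sketch of that idea, assuming the `cov()` method for distributed matrices and the `as.ddmatrix()` and `as.matrix()` converters listed in the function index. The variable names and the script name `cov_example.r` are illustrative only.

```r
# Sketch only; run with e.g.: mpirun -np 2 Rscript cov_example.r
suppressMessages(library(pbdDMAT))
init.grid()

# Same seed on every process, so every process builds the same matrix.
set.seed(1234)
x <- matrix(rnorm(30), nrow = 10, ncol = 3)

# Distributed copy of x.
dx <- as.ddmatrix(x)

# The call looks the same; method dispatch selects the distributed version.
serial_cov <- cov(x)
dist_cov <- as.matrix(cov(dx))   # gather the distributed result

# Compare the two results numerically, printing from one process.
comm.print(all.equal(serial_cov, dist_cov, check.attributes = FALSE))

finalize()
```

The comparison is numerical rather than exact, since the distributed computation may order its floating-point operations differently.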

Much of the numerical linear algebra is powered by the ScaLAPACK library, the distributed analogue of LAPACK, which R itself uses extensively.
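For instance, solving a linear system on distributed matrices might look like the sketch below. It assumes the `solve()` method for distributed matrices listed in the function index; which ScaLAPACK routine ends up being called is not shown or assumed here, and the script name `solve_example.r` is hypothetical.

```r
# Sketch only; run with e.g.: mpirun -np 2 Rscript solve_example.r
suppressMessages(library(pbdDMAT))
init.grid()

# Same seed on every process, so every process builds the same inputs.
set.seed(42)
a <- crossprod(matrix(rnorm(25), 5, 5)) + diag(5)  # well-conditioned test matrix
b <- matrix(rnorm(5), ncol = 1)

# Distribute both the coefficient matrix and the right-hand side.
da <- as.ddmatrix(a)
db <- as.ddmatrix(b)

# Solve a %*% sol = b on the distributed objects, then gather the answer.
sol <- as.matrix(solve(da, db))

# Compare against the serial solution, printing from one process.
serial_sol <- solve(a, b)
comm.print(all.equal(sol, serial_sol, check.attributes = FALSE))

finalize()
```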

## Installation

pbdDMAT requires

• A system installation of MPI
• R version 3.0.0 or higher
• The pbdMPI and pbdBASE packages, as well as their dependencies.

Assuming you meet the system dependencies, you can install the stable version from CRAN using the usual install.packages():

install.packages("pbdDMAT")


The development version is maintained on GitHub and can be installed with any of the packages that offer installation from GitHub, for example:

remotes::install_github("RBigData/pbdDMAT")


See the vignette for installation troubleshooting.

## Usage

# load the package
library(pbdDMAT)

# initialize the specialized MPI communicators
init.grid()

# create a 10x10 distributed matrix object (100 elements, 10 rows)
dx <- ddmatrix(1:100, 10)

# print
dx
print(dx, all=TRUE)

# shut down the communicators and exit
finalize()


Save this program as pbd_example.r and run it via:

mpirun -np 2 Rscript pbd_example.r


Numerous other examples can be found in the pbdDMAT vignette, as well as in the pbdDEMO package and its corresponding vignette.

## Authors

pbdDMAT is authored and maintained by the pbdR core team:

• Drew Schmidt
• Wei-Chen Chen
• George Ostrouchov
• Pragneshkumar Patel

With additional contributions from:

• The R Core team (some wrapper code taken from the base and stats packages)
• ZhaoKang Wang (fixes/improvements to apply())
• Michael Lawrence (fix for as.vector())

## Functions in pbdDMAT

| Name | Description |
| --- | --- |
| chol2inv | Inverse from Choleski (or QR) Decomposition |
| companion | Generate Companion Matrices |
| condnums | Compute or estimate the Condition Number of a Distributed Matrix |
| ddmatrix-print | Printing a Distributed Matrix |
| lm.fit | Fitter for Linear Models |
| matmult | Matrix Multiplication |
| math | Miscellaneous Mathematical Functions |
| isdot | Type Checks, Including NA, NaN, etc. |
| as.ddmatrix | Non-Distributed object to Distributed Object Converters |
| ddmatrix-class | Class ddmatrix |
| sparsity | Sparsity of Matrix Objects |
| Comparators | Logical Comparisons |
| ddmatrix-constructors | Distributed Matrix Creation |
| arithmetic | Arithmetic Operators |
| ddmatrix-svd | Singular Value Decomposition |
| getLocal | getLocal |
| ddmatrix-sumstats | Basic Summary Statistics |
| sd | Covariance and Correlation |
| headsortails | Head and Tail of a Distributed Matrix |
| insert | Directly Insert Into Distributed Matrix Submatrix Slot |
| ddmatrix-apply | Apply Family of Functions |
| qr | QR Decomposition Methods |
| pbdDMAT Control | Some default parameters for pbdDMAT. |
| Hilbert | Generate Hilbert Matrices |
| ddmatrix-chol | Cholesky Factorization |
| redistribute | Distribute/Redistribute matrices across the process grid |
| ddmatrix-norm | Norm |
| as.vector | Distributed object to Vector Converters |
| ddmatrix-eigen | eigen |
| ddmatrix-solve | Solve |
| ddmatrix-summary | Distributed Matrix Summary |
| ddmatrix-prcomp | Principal Components Analysis |
| ddmatrix-lu | LU Factorization |
| sweep | Sweep |
| na | Handle Missing Values in Distributed Matrices |
| transpose | Distributed Matrix Transpose |
| diag-constructors | Distributed Matrix Diagonals |
| isSymmetric | isSymmetric |
| extract | Extract or Replace Parts of a Distributed Matrix |
| covariance | Covariance and Correlation |
| expm | Matrix Exponentiation |
| pbdDMAT-package | Distributed Matrix Methods |
| eigen2 | eigen2 |
| reductions | Arithmetic Reductions: Sums, Means, and Prods |
| rounding | Rounding of Numbers |
| Accessors | Accessor Functions for Distributed Matrix Slots |
| binds | Row and Column binds for Distributed Matrices |
| as.matrix | Distributed object to Matrix Converters |
| as.rowcyclic | Distribute/Redistribute matrices across the process grid |
| ddmatrix-scale | Scale |