pbdDMAT (version 0.5-1)

lm.fit: Fitter for Linear Models

Description

Fits a real linear model via a QR decomposition with a "limited pivoting strategy", as in R's DQRDC2 (Fortran) routine.

Usage

# S4 method for ddmatrix,ddmatrix
lm.fit(x, y, tol = 1e-07,
  singular.ok = TRUE)

Arguments

x, y

numeric distributed matrices: x is the model matrix and y holds the response, one column per right hand side.

tol

tolerance for numerical rank estimation in QR decomposition.

singular.ok

logical. If FALSE then a singular model (rank-deficient x) produces an error.

Value

Returns a list of values similar to R's lm.fit(). Namely, the list contains:

coefficients (distributed matrix) solution to the linear least squares problem
residuals (distributed matrix) difference between the observed response and the numerical fit
effects (distributed matrix) t(Q) %*% y
rank (global numeric) numerical column rank
fitted.values (distributed matrix) numerical fit A %*% x, with A the model matrix and x the solution as in Details
assign NULL if lm.fit() is called directly
qr list, same as return from qr()
df.residual (global numeric) degrees of freedom of residuals
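
The relationships among these components can be checked on a small serial problem. The sketch below uses base R's stats::lm.fit() on ordinary matrices as a stand-in, on the assumption that the ddmatrix method mirrors it as described above. The data and variable names are made up for illustration.

```r
# Serial stand-in: base R's lm.fit() on ordinary matrices (illustrative data).
set.seed(42)
n <- 10
p <- 3
x <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)

fit <- stats::lm.fit(x, y)

# fitted.values is the model matrix times the coefficients
fitted_ok <- isTRUE(all.equal(as.numeric(fit$fitted.values),
                              as.numeric(x %*% fit$coefficients)))

# residuals are the observed response minus the fit
resid_ok <- isTRUE(all.equal(as.numeric(fit$residuals),
                             y - as.numeric(fit$fitted.values)))

# effects are t(Q) %*% y, with Q taken from the stored QR decomposition
effects_ok <- isTRUE(all.equal(as.numeric(fit$effects),
                               as.numeric(qr.qty(fit$qr, y))))

# df.residual is the number of rows minus the numerical rank
df_ok <- fit$df.residual == n - fit$rank

print(c(fitted = fitted_ok, residuals = resid_ok,
        effects = effects_ok, df = df_ok))
```

In the distributed case the same checks would need the ddmatrix components gathered or compared with distributed arithmetic; that step is not shown here.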

Details

Solves the linear least squares problem: find a coefficient vector x (possibly non-unique) that minimizes || Ax - b ||^2, where A is a given n-by-p model matrix, b is an n-by-1 "right hand side" vector, and "||" is the l2 norm. Here A and b correspond to the arguments x and y, and the x being solved for is returned as coefficients. Multiple right hand sides can be solved at once, but the solutions are independent, i.e. not simultaneous.
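
As a rough illustration of the problem statement, the sketch below solves a small least squares problem serially with base R's stats::lm.fit() (a stand-in for the distributed method, not the pbdDMAT code path). It compares the result with the normal equations and checks that two right hand sides solved together give the same answer as one solved alone. The names A, B, rss and the data are illustrative assumptions.

```r
# Serial illustration with base R; A, B and rss() are made up for this sketch.
set.seed(1)
A <- matrix(rnorm(60), nrow = 20, ncol = 3)   # n-by-p model matrix
B <- matrix(rnorm(40), nrow = 20, ncol = 2)   # two right hand sides

fit_both  <- stats::lm.fit(A, B)        # both right hand sides at once
fit_first <- stats::lm.fit(A, B[, 1])   # first right hand side alone

# Normal equations t(A) A x = t(A) b; acceptable for a small, well-conditioned A
x_ne <- solve(crossprod(A), crossprod(A, B))
matches_ne <- isTRUE(all.equal(as.numeric(fit_both$coefficients),
                               as.numeric(x_ne)))

# Each right hand side is solved independently of the others
independent <- isTRUE(all.equal(as.numeric(fit_both$coefficients[, 1]),
                                as.numeric(fit_first$coefficients)))

# The solution minimizes || Ax - b ||^2: moving away from it raises the objective
rss <- function(x) sum((A %*% x - B[, 1])^2)
x_hat <- as.numeric(fit_first$coefficients)
perturbed_is_worse <- rss(x_hat + 0.01) > rss(x_hat)

print(c(matches_ne = matches_ne, independent = independent,
        perturbed_is_worse = perturbed_is_worse))
```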

Uses level 3 PBLAS and ScaLAPACK routines (modified PDGELS) to get a linear least squares solution, using the 'limited pivoting strategy' from R's DQRDC2 (used in DQRLS) routine as a way of dealing with (possibly) rank deficient model matrices.

A model matrix with many dependent columns will likely experience poor performance, especially at scale, due to all the data swapping that must occur to handle rank deficiency.
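
To see what the limited pivoting strategy does to a dependent column, the sketch below runs base R's serial stats::lm.fit(), which uses DQRDC2 through DQRLS, on a model matrix whose second column is an exact multiple of the first. It is only an analogy for the distributed method: whether the ddmatrix method reports the dropped column in exactly the same way is an assumption, and the column names are made up.

```r
# Serial illustration with base R; column names and data are hypothetical.
set.seed(7)
a <- rnorm(8)
b <- rnorm(8)
x <- cbind(a = a, dup = 2 * a, b = b)   # "dup" is an exact multiple of "a"
y <- rnorm(8)

fit <- stats::lm.fit(x, y)

print(fit$rank)           # numerical column rank found by the pivoted QR
print(fit$qr$pivot)       # column order after dependent columns are moved last
print(fit$coefficients)   # the dependent column gets no estimate

# With singular.ok = FALSE the rank deficient model is an error instead
msg <- tryCatch(
  stats::lm.fit(x, y, singular.ok = FALSE),
  error = function(e) conditionMessage(e)
)
print(msg)
```

In the serial routine the dependent column is swapped to the end of the factorization. In a distributed setting each such swap moves data between processes, which is the cost the paragraph above warns about.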

Examples

# NOT RUN {
spmd.code = "
  library(pbdDMAT, quietly = TRUE)
  init.grid()
  
  # don't do this in production code: the full matrix is built on every rank
  x <- matrix(rnorm(9), 3)
  y <- matrix(rnorm(3))
  
  dx <- as.ddmatrix(x)
  dy <- as.ddmatrix(y)
  
  fit <- lm.fit(x=dx, y=dy)
  fit
  
  finalize()
"

pbdMPI::execmpi(spmd.code = spmd.code, nranks=2L)

# }
