
RcppML (version 0.3.7)

project: Project a linear factor model

Description

Solves the equation \(A = wh\) for either \(h\) (given \(A\) and \(w\)) or \(w\) (given \(A\) and \(h\)).

Usage

project(A, w = NULL, h = NULL, nonneg = TRUE, L1 = 0, mask_zeros = FALSE)

Arguments

A

matrix of features-by-samples in dense or sparse format (preferred classes are "matrix" or "Matrix::dgCMatrix", respectively). Prefer sparse storage when more than half of all values are zero.

w

dense matrix of factors x features giving the linear model to be projected (if h = NULL)

h

dense matrix of factors x samples giving the linear model to be projected (if w = NULL)

nonneg

enforce non-negativity

L1

L1/LASSO penalty to be applied. No scaling is performed. See details.

mask_zeros

handle zeros as missing values, available only when A is sparse

Value

matrix \(h\) or \(w\)

Details

For the classical alternating least squares matrix factorization update problem \(A = wh\), the update (or projection) of \(h\) is given by the normal equations $$w^Twh_j = w^TA_j,$$ which are in the form \(ax = b\) where \(a = w^Tw\), \(x = h_j\), and \(b = w^TA_j\), for all columns \(j\) in \(A\).
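When nonneg = FALSE and L1 = 0, this projection reduces to an ordinary least squares solve. Below is a minimal base-R sketch of that unconstrained case, assuming \(A\) (features x samples) and \(w\) (features x factors) as constructed in the examples below; project itself solves the constrained problem with non-negative least squares when nonneg = TRUE:

a <- crossprod(w)           # a = t(w) %*% w, a factors x factors Gram matrix
b <- crossprod(w, A)        # b = t(w) %*% A, right-hand sides for all columns j
h <- solve(a, as.matrix(b)) # one factorization of "a" shared by all columns of A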

Given \(A\), project can solve for either \(w\) or \(h\) when the other is provided:

  • When given \(A\) and \(w\), \(h\) is found using a highly efficient parallelization scheme.

  • When given \(A\) and \(h\), \(w\) is found without transposing \(A\) (as would be required in traditional block-pivoting matrix factorization) by accumulating the right-hand sides of the linear systems in place, then solving the systems. Alternatively, \(w\) may be found by supplying the transpose of \(A\) and using \(h\) in place of \(w\) (i.e. A = t(A), w = h, h = NULL); see the final example below. For a single projection, the in-place solve is usually cheaper than transposing \(A\); for matrix factorization, however, it is worth transposing \(A\) once because many projections are needed.

Parallelization. Least squares projections in factorizations of rank 3 and greater are parallelized using the number of threads set by setRcppMLthreads. By default, all available threads are used; see getRcppMLthreads. Rank-1 and rank-2 projections are not parallelized, because for them the overhead of parallelization outweighs any gain.
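A short sketch of these thread controls, assuming \(A\) and \(w\) as constructed in the examples below:

library(RcppML)
getRcppMLthreads()  # 0 means all available threads (the default)
setRcppMLthreads(4) # limit least squares projections to 4 threads
h <- project(A, w)  # this rank-10 projection now uses at most 4 threads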

L1 Regularization. The L1 penalty is subtracted from \(b\) (where \(b = w^TA_j\) for all columns \(j\) in \(A\)) and should generally be scaled relative to max(b). No scaling is applied within this function. An easy way to scale the penalty properly is to normalize all columns of \(w\) to sum to 1; such scaling guarantees that L1 = 1 gives a completely sparse solution.
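A minimal sketch of that scaling, assuming \(A\) and \(w\) as constructed in the examples below:

w_scaled <- sweep(w, 2, colSums(w), "/")   # each column of "w" now sums to 1
h_sparse <- project(A, w_scaled, L1 = 0.9) # a strong, properly scaled penalty
mean(h_sparse == 0)                        # fraction of coefficients driven exactly to zero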

Specializations. There are specializations for symmetric input matrices, and for rank-1 and rank-2 projections. See documentation for nmf for theoretical details and guidance.

Publication reference. For theoretical and practical considerations, please see our manuscript: DeBruine ZJ, Melcher K, Triche TJ (2021), "High-performance non-negative matrix factorization for large single-cell data," on bioRxiv.

References

DeBruine, ZJ, Melcher, K, and Triche, TJ. (2021). "High-performance non-negative matrix factorization for large single-cell data." bioRxiv.

See Also

nnls, nmf

Examples

library(RcppML)
library(Matrix)
w <- matrix(runif(1000 * 10), 1000, 10)
h_true <- matrix(runif(10 * 100), 10, 100)
# A is the product of "w" and "h_true" with 10% signal dropout
A <- (w %*% h_true) * (rsparsematrix(1000, 100, 0.9) != 0)
h <- project(A, w)
cor(as.vector(h_true), as.vector(h))

# alternating projections refine solution (like NMF)
mse_bad <- mse(A, w, rep(1, ncol(w)), h) # mse before alternating updates
h <- project(A, w)
w <- t(project(A, h = h))
h <- project(A, w)
w <- t(project(A, h = h))
h <- project(A, w)
w <- t(project(A, h = h))
mse_better <- mse(A, w, rep(1, ncol(w)), h) # mse after alternating updates
mse_better < mse_bad

# two ways to solve for "w" that give the same solution
w <- project(A, h = h)
w2 <- project(t(A), w = t(h))
all.equal(w, w2)
