fcnnls: Fast Combinatorial Non-Negative Least-Square

Description

This function solves the following non-negative least square linear problem using normal equations and the fast combinatorial strategy from Benthem and Keenan (2004):

$$\begin{array}{l} \min \|Y - X K\|_F\ \mbox{s.t. } K>=0 \end{array}$$ where $\|.\|_F$ is the Frobenius norm.

The resulting algorithm is very fast to converge compared to other approaches.

Within the NMF package, this algorithm is used internally by the SNMF/R(L) algorithm from Kim and Park (2007) to solve general Nonnegative Matrix Factorization (NMF) problems, using alternating non-negative constrained least-squares. That is by iteratively and alternatively estimate each matrix factor (see section References).

It is provided separately so that it can be used to solve other types of non-negative least squares problem. For faster computation, please the internal -- non-exported -- function NMF:::.fcnnls The code is a port from the original MATLAB code used in Kim and Park (2007) (see references).

Usage

## S3 method for class 'matrix,matrix':
fcnnls(x, y, verbose=FALSE, pseudo=TRUE, ...)

Arguments

the coefficient matrix

the target matrix to be approximated by $X K$.

verbose

toggle verbosity (default is FALSE).

pseudo

By default (pseudo=FALSE) the algorithm uses Gaussian elimination to solve the successive internal linear problems, using the solve function. If pseudo=TRUE the algorithm uses M

...

extra arguments passed to the internal function .fcnnls. Currently not used.

Value

The returned value is a list containing the following components:
xthe estimated optimal matrix $K$.
fittedthe fitted matrix $X K$.
residualsthe residual matrix $Y - X K$.
deviancethe residual sum of squares between the fitted matrix $X K$ and the target matrix $Y$. That is the sum of the square residuals.
passivea $r x p$ logical matrix containing the passive set, that is the set of entries in $K$ that are not null (i.e. strictly positive).
pseudoa logical that is TRUE if the computation was performed using the pseudoinverse. See argument pseudo.

Details

Given two real matrices $Y$ and $X$, of dimension $n \times p$ and $n \times r$ respectively, this algorithm solves for the optimal nonnegative matrix $K$ ($r \times p$) such that: $$\begin{array}{l} \min \|Y - X K\|_F\ \mbox{s.t. } K>=0 \end{array}$$ where $\|.\|_F$ is the Frobenius norm.

It is based on the active/passive set method. It uses the unconstrained solution $K_u$ obtained from the unconstrained least squares problem, i.e. $\min \|Y - X K\|_F^2$ , so as to determine the initial passive sets.

References

M. H. van Benthem and M. R. Keenan (2004). Fast algo-rithm for the solution of large-scale non-negativity-constrained least squares problems. J. Chemometrics 2004, 18:441-450.

Kim, H. and Park, H. (2007). Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis. Bioinformatics 2007; 23(12):1495-502. Original MATLAB code from Van Benthem and Keenan, slightly modified by H. Kim: http://www.cc.gatech.edu/~hpark/software/fcnnls.m

Examples

Run this code

## Define a random non-negative matrix matrix
n <- 200; p <- 20; r <- 3
V <- matrix(runif(n*p), n, p)

## Compute the optimal matrix K for a given X matrix
X <- matrix(runif(n*r), n, r)
res <- fcnnls(X, V)

## Compute the same thing using the Moore-Penrose generalized pseudoinverse
res <- fcnnls(X, V, pseudo=TRUE)

## It also works in the case of single vectors
y <- runif(n)
res <- fcnnls(X, y)
# or
res <- fcnnls(X[,1], y)

Run the code above in your browser using DataLab