nmf_update.brunet_R: NMF Algorithm/Updates for Kullback-Leibler Divergence

Description

The built-in NMF algorithms described here minimise the Kullback-Leibler (KL) divergence between an NMF model and a target matrix. They use the updates for the basis and coefficient matrices ($W$ and $H$) defined by Brunet et al. (2004), which are essentially those from Lee and Seung (2001), with a stabilisation step that, every 10 iterations, shifts all entries up from zero to a very small positive value.
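For reference, the objective and the multiplicative updates can be written out as below. This is the standard Lee and Seung formulation for a target matrix $V$ approximated by $WH$; the index names $i$, $j$, $k$ and the symbol $\varepsilon$ (standing for the argument eps) are notation chosen here for illustration, and the last line is one reading of the stabilisation step described above.

```latex
\[
D(V \,\|\, WH) = \sum_{i,j} \left( V_{ij} \log \frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \right)
\]
\[
H_{kj} \leftarrow H_{kj} \, \frac{\sum_{i} W_{ik} \, V_{ij} / (WH)_{ij}}{\sum_{i} W_{ik}},
\qquad
W_{ik} \leftarrow W_{ik} \, \frac{\sum_{j} H_{kj} \, V_{ij} / (WH)_{ij}}{\sum_{j} H_{kj}}
\]
\[
\text{every 10 iterations:} \quad
W_{ik} \leftarrow \max(W_{ik}, \varepsilon), \qquad H_{kj} \leftarrow \max(H_{kj}, \varepsilon)
\]
```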

nmf_update.brunet implements in C++ an optimised version of the single update step.

Algorithms brunet and .R#brunet provide the complete NMF algorithm from Brunet et al. (2004), using the C++-optimised and pure R updates nmf_update.brunet and nmf_update.brunet_R respectively.

Algorithm KL provides an NMF algorithm based on the C++-optimised version of the updates from Brunet et al. (2004). It uses the stationarity of the objective value as its stopping criterion (nmf.stop.stationary), instead of the stationarity of the connectivity matrix (nmf.stop.connectivity) used by brunet.
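The idea of stopping on a stationary objective value can be sketched in base R as follows. This is a simplified illustration, not the code of nmf.stop.stationary: the helper name is_stationary and the exact form of the criterion (spread of the objective over a window, divided by the window length) are assumptions made for the example.

```r
# Hypothetical helper: decide stationarity from a window of objective values
# recorded over successive iterations.
is_stationary <- function(values, th = .Machine$double.eps) {
  # spread of the objective over the window, scaled by the window length
  crit <- (max(values) - min(values)) / length(values)
  crit <= th
}

# An objective that is still decreasing, and one that no longer moves
decreasing <- 100 / (1:10)
flat <- rep(42, 10)

is_stationary(decreasing)  # not stationary with the default threshold
is_stationary(flat)        # stationary: the spread is exactly zero
```

In a full algorithm such a check would typically be run only every few iterations, since evaluating the objective value has a cost of its own.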

library(RcppOctave)
file.show(system.mfile('brunet.m', package='NMF'))

Usage

nmf_update.brunet_R(i, v, x, eps = .Machine$double.eps,
    ...)
nmf_update.brunet(i, v, x, copy = FALSE,
    eps = .Machine$double.eps, ...)
nmfAlgorithm.brunet_R(..., .stop = NULL, maxIter = 2000,
    eps = .Machine$double.eps, stopconv = 40,
    check.interval = 10)
nmfAlgorithm.brunet(..., .stop = NULL, maxIter = 2000,
    copy = FALSE, eps = .Machine$double.eps, stopconv = 40,
    check.interval = 10)
nmfAlgorithm.KL(..., .stop = NULL, maxIter = 2000,
    copy = FALSE, eps = .Machine$double.eps,
    stationary.th = .Machine$double.eps,
    check.interval = 5 * check.niter, check.niter = 10L)
nmfAlgorithm.brunet_M(..., object, y, x)

Arguments

i

current iteration number.

v

target matrix.

x

current NMF model, as an NMF object.

eps

small numeric value used to ensure numeric stability, by shifting up entries from zero to this fixed value.

...

extra arguments. These are generally not used and are present only to allow other arguments from the main call to be passed to the initialisation and stopping criterion functions (slots onInit and Stop respectively).

copy

logical that indicates whether the update is made on the original matrix directly (FALSE) or on a copy (TRUE). Note that the signatures under Usage set copy = FALSE as the default. With copy = FALSE the memory footprint is very small, and some speed-up may be achieved.

.stop

specification of a stopping criterion that is used instead of the one associated with the NMF algorithm. It may be specified as:

the access key of a registered stopping criterion;
a single integer that specifies the exact number of

maxIter

maximum number of iterations to perform.

object

an object computed using some algorithm, or that describes an algorithm itself.

y

data object, e.g. a target matrix.

stopconv

number of iteration intervals over which the connectivity matrix must not change for stationarity to be achieved.

check.interval

interval (in number of iterations) on which the stopping criterion is computed.

stationary.th

maximum absolute value of the gradient, for the objective function to be considered stationary.

check.niter

number of successive iterations used to compute the stationarity criterion.
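The role of stopconv and check.interval in the connectivity-based criterion can be illustrated with a small base R sketch. It is not the package's nmf.stop.connectivity: the helpers connectivity_from_coef and update_stable_count are hypothetical names, and the sketch assumes each sample (column of the coefficient matrix) is assigned to the basis component with the largest coefficient.

```r
# Hypothetical helper: connectivity matrix from a coefficient matrix h (r x p).
connectivity_from_coef <- function(h) {
  # each sample (column) is assigned to the row holding its largest coefficient
  cluster <- apply(h, 2, which.max)
  # entry (i, j) is 1 when samples i and j share a cluster, 0 otherwise
  outer(cluster, cluster, "==") * 1L
}

# Hypothetical helper: count consecutive checks with an unchanged matrix.
# The count is reset to zero as soon as the connectivity matrix changes.
update_stable_count <- function(count, old, new) {
  if (!is.null(old) && identical(old, new)) count + 1L else 0L
}

h1 <- rbind(c(0.9, 0.8, 0.1), c(0.1, 0.2, 0.9))
h2 <- rbind(c(0.7, 0.6, 0.2), c(0.3, 0.4, 0.8))  # same cluster assignments as h1
h3 <- rbind(c(0.7, 0.1, 0.2), c(0.3, 0.9, 0.8))  # sample 2 switches cluster

c1 <- connectivity_from_coef(h1)
c2 <- connectivity_from_coef(h2)
c3 <- connectivity_from_coef(h3)
```

Under this reading, the check would be run every check.interval iterations, and the algorithm would stop once the count of unchanged checks reaches stopconv.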

Source

Original MATLAB files and references can be found at:

http://www.broadinstitute.org/mpr/publications/projects/NMF/nmf.m

http://www.broadinstitute.org/publications/broad872

Original license terms:

This software and its documentation are copyright 2004 by the Broad Institute/Massachusetts Institute of Technology. All rights are reserved. This software is supplied without any warranty or guaranteed support whatsoever. Neither the Broad Institute nor MIT can not be responsible for its use, misuse, or functionality.

Details

nmf_update.brunet_R implements in pure R a single update step, i.e. it updates both matrices.
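A single update step of this kind can be sketched on plain matrices as follows. This is a minimal base R illustration under stated assumptions, not the package's implementation: the function names kl_update_step and kl_divergence are hypothetical, and the sketch works on separate matrices w and h rather than on an NMF object.

```r
# Illustrative single KL update step on plain matrices.
# v: target matrix (n x p), w: basis (n x r), h: coefficients (r x p).
kl_update_step <- function(i, v, w, h, eps = .Machine$double.eps) {
  # update H: H <- H * (t(W) %*% (V / WH)) / colSums(W), row k scaled by colSums(w)[k]
  h <- h * crossprod(w, v / (w %*% h)) / colSums(w)
  # update W using the new H: W <- W * ((V / WH) %*% t(H)) / rowSums(H)
  w <- w * tcrossprod(v / (w %*% h), h)
  w <- sweep(w, 2, rowSums(h), "/")
  # every 10 iterations, lift entries below eps up to eps
  if (i %% 10 == 0) {
    h[h < eps] <- eps
    w[w < eps] <- eps
  }
  list(w = w, h = h)
}

# KL divergence between a target v and its approximation wh
kl_divergence <- function(v, wh) {
  # entries with v == 0 contribute only wh
  sum(ifelse(v > 0, v * log(v / wh), 0) - v + wh)
}

# Small run on random positive data
set.seed(1)
v <- matrix(runif(20 * 8) + 0.1, 20, 8)
w <- matrix(runif(20 * 3) + 0.1, 20, 3)
h <- matrix(runif(3 * 8) + 0.1, 3, 8)
d0 <- kl_divergence(v, w %*% h)
for (i in 1:50) {
  fit <- kl_update_step(i, v, w, h)
  w <- fit$w
  h <- fit$h
}
d1 <- kl_divergence(v, w %*% h)
```

Comparing d0 and d1 shows the divergence before and after the 50 steps; with these multiplicative updates the value is expected to go down.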

References

Brunet J, Tamayo P, Golub TR and Mesirov JP (2004). "Metagenes and molecular pattern discovery using matrix factorization." _Proceedings of the National Academy of Sciences of the United States of America_, *101*(12), pp. 4164-9. ISSN 0027-8424.

Lee DD and Seung H (2001). "Algorithms for non-negative matrix factorization." _Advances in neural information processing systems_.