NMF (version 0.16.1)

nmf: Running NMF algorithms

Description

The function nmf is an S4 generic that defines the main interface to run NMF algorithms within the framework defined in package NMF. It has many methods that facilitate applying, developing and testing NMF algorithms.

The package vignette vignette('NMF') contains an introduction to the interface, through a sample data analysis.

Usage

nmf(x, rank, method, ...)

## S4 method for signature 'matrix,numeric,NULL'
nmf(x, rank, method, seed = NULL, model = NULL, ...)

## S4 method for signature 'matrix,numeric,function'
nmf(x, rank, method, seed, model = "NMFstd", ..., name, objective = "euclidean", mixed = FALSE)

## S4 method for signature 'matrix,NMF,ANY'
nmf(x, rank, method, seed, ...)

## S4 method for signature 'matrix,NULL,ANY'
nmf(x, rank, method, seed, ...)

## S4 method for signature 'matrix,matrix,ANY'
nmf(x, rank, method, seed, model = list(), ...)

## S4 method for signature 'formula,ANY,ANY'
nmf(x, rank, method, ..., model = NULL)

## S4 method for signature 'matrix,numeric,NMFStrategy'
nmf(x, rank, method, seed = nmf.getOption("default.seed"), rng = NULL, nrun = if (length(rank) > 1) 30 else 1, model = NULL, .options = list(), .pbackend = nmf.getOption("pbackend"), .callback = NULL, ...)

Arguments

x
target data to fit, i.e. a matrix-like object
rank
specification of the factorization rank. It is usually a single numeric value, but other types of values are possible (e.g. matrix), for which specific methods are implemented. See for example method nmf,matrix,matrix,ANY.

If

method
specification of the NMF algorithm. The most common way of specifying the algorithm is to pass the access key (i.e. a character string) of an algorithm stored in the package's dedicated registry, but methods exist that handle other types of value
...
extra arguments to allow extension of the generic. Arguments that are not used in the chain of internal calls to nmf methods are passed to the function that effectively implements the algorithm that fits an NMF model on x
name
name associated with the NMF algorithm implemented by the function method [only used when method is a function].
objective
specification of the objective function associated with the algorithm implemented by the function method [only used when method is a function].

It may be either 'euclidean' or 'KL' for specify

mixed
a logical that indicates if the algorithm implemented by the function method supports mixed-sign target matrices, i.e. matrices that may contain negative values [only used when method is a function].
seed
specification of the starting point or seeding method, which will compute a starting point, usually using data from the target matrix in order to provide a good guess.

The seeding method may be specified in the following way:

rng
RNG specification for the run(s). This argument should be used to set the RNG seed, while still specifying the seeding method through argument seed.
model
specification of the type of NMF model to use.

It is used to instantiate the object that inherits from class NMF, that will be passed to the seeding method. The following values are supported:

nrun
number of runs to perform. By default only one run is performed, except if rank is a numeric vector with more than one element, in which case a default of 30 runs per value of the rank are p
.options
this argument is used to set runtime options.

It can be a list containing named options with their values, or, in the case only boolean/integer options need to be set, a character string that specifies which options are turned on/o

.pbackend
specification of the foreach parallel backend to register and/or use when running in parallel mode. See options p and P in argument .options for how to enable
.callback
Used when option keep.all=FALSE (default). It allows passing a callback function that is called after each run when performing multiple runs (i.e. with nrun>1). This is useful for example if one is also interested in sav

Value

The returned value depends on the run mode:
  • Single run: An object of class NMFfit.
  • Multiple runs, single method: When nrun > 1 and method is not a list, this method returns an object of class NMFfitX.
  • Multiple runs, multiple methods: When nrun > 1 and method is a list, this method returns an object of class NMFList.
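
The three run modes can be checked directly on the returned objects. The following is a minimal sketch, assuming the NMF package is installed and using the package's rmatrix helper (as in the Examples section) to draw a small random matrix. The rank, the number of runs and the two algorithm keys are arbitrary choices for illustration.

```r
library(NMF)
library(methods)  # provides is()

set.seed(1)            # makes the random target matrix reproducible
x <- rmatrix(20, 10)   # 20 x 10 random non-negative matrix

# Single run: expected to return an NMFfit object
res <- nmf(x, 2)
is(res, "NMFfit")

# Multiple runs, single method: expected to return an NMFfitX object
res_multi <- nmf(x, 2, nrun = 3)
is(res_multi, "NMFfitX")

# Multiple runs, multiple methods (method is a list): expected NMFList
res_list <- nmf(x, 2, list("brunet", "lee"), nrun = 3)
is(res_list, "NMFList")
```

Testing with is() rather than comparing class() strings keeps the check valid if the returned object belongs to a subclass of the documented class.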

Optimized C++ vs. plain R

Lee and Seung's multiplicative updates are used by several NMF algorithms. To improve speed and memory usage, a C++ implementation of the specific matrix products is used whenever possible. It directly computes the updates for each entry in the updated matrix, instead of using multiple standard matrix multiplications.

The algorithms that benefit from this optimization are: 'brunet', 'lee', 'nsNMF' and 'offset'. However, plain R versions of these methods still exist, which implement the updates as standard matrix products. These are accessible by adding the prefix '.R#' to their name: '.R#brunet', '.R#lee', '.R#nsNMF' and '.R#offset'.
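
As a rough illustration, the optimized and plain R versions of one algorithm can be run side by side. This sketch assumes that a single numeric value passed as seed fixes the RNG so that both runs start from the same random initial point. The accessors algorithm, runtime and fitted come from the NMF package; no particular timing or numerical difference is claimed here.

```r
library(NMF)

set.seed(1)
x <- rmatrix(20, 10)

# Same algorithm and same numeric seed, two implementations
res_cpp <- nmf(x, 2, "brunet", seed = 123456)     # optimized C++ updates
res_r   <- nmf(x, 2, ".R#brunet", seed = 123456)  # plain R matrix products

# Name of the algorithm that was actually run
algorithm(res_cpp)
algorithm(res_r)

# Computation time recorded for each fit
runtime(res_cpp)
runtime(res_r)

# Largest entry-wise gap between the two fitted matrices
# (reported for inspection only; not assumed to be exactly zero)
max(abs(fitted(res_cpp) - fitted(res_r)))
```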

Seeding methods

The purpose of seeding methods is to compute initial values for the factor matrices in a given NMF model. This initial guess will be used as a starting point by the chosen NMF algorithm.

The seeding method to use in combination with the algorithm can be passed to interface nmf through argument seed. The seeding methods available in the registry are listed by the function nmfSeed (see list therein).

Detailed examples of how to specify the seeding method and its parameters can be found in the Examples section of this man page and in the package's vignette.
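
A short sketch of the seed argument in use, assuming the registry contains the seeding methods 'random' and 'nndsvd' (check the output of nmfSeed() on your installation). The rng value is an arbitrary number chosen for illustration.

```r
library(NMF)

# List the seeding methods available in the registry
nmfSeed()

set.seed(1)
x <- rmatrix(20, 10)

# Seeding method given by its registry key
res_random <- nmf(x, 2, seed = "random")

# Keep the seeding method but fix the RNG seed through argument rng
res_fixed <- nmf(x, 2, seed = "random", rng = 123456)

# A seeding method that derives its starting point from the target matrix
res_nndsvd <- nmf(x, 2, seed = "nndsvd")

# Inspect which seeding method each fit recorded
seeding(res_random)
seeding(res_fixed)
seeding(res_nndsvd)
```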

Details

The nmf function has multiple methods that compose a very flexible interface, which makes it possible to:
  • combine NMF algorithms with seeding methods and/or stopping/convergence criteria at runtime;
  • perform multiple NMF runs, which are computed in parallel whenever the host machine allows it;
  • run multiple algorithms with a common set of parameters, ensuring a consistent environment (notably the RNG settings).

The workhorse method is nmf,matrix,numeric,NMFStrategy, which is eventually called by all other methods. The other methods provide convenient ways of specifying the NMF algorithm(s), the factorization rank, or the seed to be used. Some allow running NMF algorithms directly on different types of objects, such as data.frame or ExpressionSet objects.
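
One way of specifying the algorithm is to pass a function as method, together with the name, objective and mixed arguments. The sketch below is illustrative only: my_algorithm and its n_iter parameter are hypothetical, and it assumes the function is called with the target matrix and a seeded NMF model and must return the fitted model. It applies a fixed number of Lee and Seung multiplicative updates for the Euclidean distance, without any convergence check.

```r
library(NMF)

# Hypothetical custom algorithm (not part of the package).
# x: target matrix; seed: NMF model holding the starting point.
# n_iter: illustrative parameter, forwarded by nmf through `...`.
my_algorithm <- function(x, seed, n_iter = 50) {
  w <- basis(seed)   # basis matrix W
  h <- coef(seed)    # coefficient matrix H
  eps <- 1e-9        # guards against division by zero
  for (i in seq_len(n_iter)) {
    # Multiplicative updates for the Euclidean objective
    h <- h * (t(w) %*% x) / (t(w) %*% w %*% h + eps)
    w <- w * (x %*% t(h)) / (w %*% h %*% t(h) + eps)
  }
  basis(seed) <- w
  coef(seed) <- h
  seed               # return the updated model
}

set.seed(1)
x <- rmatrix(20, 10)

res <- nmf(x, 2, my_algorithm, name = "my_algorithm",
           objective = "euclidean", mixed = FALSE, n_iter = 50)
res
```

Because the updates only multiply and divide non-negative quantities, the factors stay non-negative as long as the starting point is non-negative, which is why mixed is left at FALSE here.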

References

Brunet J, Tamayo P, Golub TR and Mesirov JP (2004). "Metagenes and molecular pattern discovery using matrix factorization." _Proceedings of the National Academy of Sciences of the United States of America_, *101*(12), pp. 4164-9. ISSN 0027-8424.

Lee DD and Seung H (2001). "Algorithms for non-negative matrix factorization." _Advances in neural information processing systems_.

Wang G, Kossenkov AV and Ochs MF (2006). "LS-NMF: a modified non-negative matrix factorization algorithm utilizing uncertainty estimates." _BMC bioinformatics_, *7*, pp. 175. ISSN 1471-2105.

Pascual-Montano A, Carazo JM, Kochi K, Lehmann D and Pascual-Marqui RD (2006). "Nonsmooth nonnegative matrix factorization (nsNMF)." _IEEE Trans. Pattern Anal. Mach. Intell_, *28*, pp. 403-415.

Badea L (2008). "Extracting gene expression profiles common to colon and pancreatic adenocarcinoma using simultaneous nonnegative matrix factorization." _Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing_, *290*, pp. 267-78. ISSN 1793-5091.

Kim H and Park H (2007). "Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis." _Bioinformatics (Oxford, England)_, *23*(12), pp. 1495-502. ISSN 1460-2059.

Van Benthem M and Keenan MR (2004). "Fast algorithm for the solution of large-scale non-negativity-constrained least squares problems." _Journal of Chemometrics_, *18*(10), pp. 441-450. ISSN 0886-9383.

See Also

nmfAlgorithm

Examples

# Only basic calls are presented in this manpage.
# Many more examples are provided in the demo file nmf.R
demo('nmf')

# random data
x <- rmatrix(20,10)

# run default algorithm with rank 2
res <- nmf(x, 2)

# specify the algorithm
res <- nmf(x, 2, 'lee')

# get verbose message on what is going on
res <- nmf(x, 2, .options='v')
# more messages
res <- nmf(x, 2, .options='v2')
# even more
res <- nmf(x, 2, .options='v3')
# and so on ...
