NMF (version 0.5.06)

nmf-methods: Main Interface to run NMF algorithms

Description

This method implements the main interface for launching NMF algorithms within the framework defined in package NMF. It allows NMF algorithms to be combined with seeding methods. The returned object can be passed directly to visualisation or comparison methods. For a tutorial on how to use the interface, please see the package's vignette: vignette('NMF')

Usage

## S4 method for signature 'matrix,numeric,function':
nmf(x, rank, method, name, objective='euclidean', model='NMFstd',
    mixed=FALSE, ...)

## S4 method for signature 'matrix,numeric,character':
nmf(x, rank, method, ...)

## S4 method for signature 'matrix,numeric,NMFStrategy':
nmf(x, rank, method, seed=nmf.getOption('default.seed'),
    nrun=1, model=NULL, .options=list(),
    .pbackend=nmf.getOption("parallel.backend"),
    .callback=NULL,
    ...)

Arguments

method
The algorithm to use to perform NMF on x. Different formats are allowed: character, function. If missing, the method to use is retrieved from the NMF package's specific options by nmf.getOption("defau
mixed
Boolean that states if the algorithm requires a nonnegative input matrix (mixed=FALSE which is the default value) or accepts mixed sign input matrices (mixed=TRUE). An error is thrown if the sign required is not ful
model
When method is a function, argument model must be either a single character string (defaults to 'NMFstd') or a list that specifies values for slots in the NMF model. The NMF mo
name
A character string to be used as a name for the custom NMF algorithm.
nrun
Used to perform multiple runs of the algorithm. It specifies the number of runs to perform. This argument is useful to achieve stability when using a random seeding method.
objective
Used when method is a function. It must be a character string giving the name of a built-in distance method, or a function to be used as the objective function. It is used to compute the resid
.callback
Used when option keep.all=FALSE (default). It allows passing a callback function that is called after each run when performing multiple runs (i.e. with nrun>1). This is useful for example if one is also interested in sa
.options
This argument is used to set some runtime options. It can be a list containing the named options and their values, or, in the case only boolean options need to be set, a character string that specifies which options are turned on or
.pbackend
Defines the parallel backend (from the foreach package) to use when running in parallel mode. See options p and P in argument .options. Currently it ac
rank
The factorization rank to achieve (i.e. a single positive numeric value).
seed
The seeding method to use to compute the starting point passed to the algorithm. See section Seeding methods for more details on the possible classes and types for argument seed.
x
The target object to estimate. It can be a matrix, a data.frame, an ExpressionSet object (this requires the Biobase package to
...
Extra parameters passed to the NMF algorithm's run method or used to initialise the NMF model slots. If argument model is not supplied as a list, ANY of the arguments in ... that have the sa
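
To illustrate what arguments nrun and .callback are for, the following base R sketch shows the general idea of multiple runs: fit the model from several random starting points, apply a callback to the result of each run, and keep only the best fit. It is an illustration under stated assumptions, not the package's implementation: the helpers fit_once() and multi_run_sketch(), their arguments, and the use of the squared residual to pick the best fit are all hypothetical.

```r
# Sketch only: the idea behind nrun (several randomly seeded runs, keep the
# best fit) and .callback (a function applied to each run's result).
# fit_once() and multi_run_sketch() are hypothetical helpers, not NMF functions.
fit_once <- function(V, rank, maxiter = 100, eps = 1e-9) {
  # random nonnegative starting point
  W <- matrix(runif(nrow(V) * rank), nrow(V), rank)
  H <- matrix(runif(rank * ncol(V)), rank, ncol(V))
  for (i in seq_len(maxiter)) {
    # multiplicative updates for the Euclidean objective
    H <- H * crossprod(W, V) / (crossprod(W) %*% H + eps)
    W <- W * tcrossprod(V, H) / (W %*% tcrossprod(H) + eps)
  }
  list(W = W, H = H, residual = sum((V - W %*% H)^2))
}

multi_run_sketch <- function(V, rank, nrun = 1, callback = NULL) {
  best <- NULL
  cb_results <- vector("list", nrun)
  for (r in seq_len(nrun)) {
    fit <- fit_once(V, rank)
    # the callback sees every run, even those that are later discarded
    if (!is.null(callback)) cb_results[[r]] <- callback(fit)
    # only the fit with the smallest residual is retained
    if (is.null(best) || fit$residual < best$residual) best <- fit
  }
  list(best = best, callback = cb_results)
}

set.seed(42)
V <- matrix(runif(30 * 10), 30, 10)
res <- multi_run_sketch(V, rank = 3, nrun = 5,
                        callback = function(fit) fit$residual)
res$best$residual     # residual of the retained fit
unlist(res$callback)  # one callback value per run
```

In this sketch the callback is the only way to learn anything about the discarded runs, which is the situation the .callback argument addresses when keep.all=FALSE.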

Value

The returned value depends on the run mode:

  • Single run: An object that inherits from class NMF.
  • Multiple runs, single method: When nrun > 1 and method is not a list, this method returns an object of class NMFfitX.
  • Multiple runs, multiple methods: When nrun > 1 and method is a list, this method returns an object of class NMFList.

Optimized C++ vs. plain R

Lee and Seung's multiplicative updates are used by several NMF algorithms. To improve speed and memory usage, a C++ implementation of the specific matrix products is used whenever possible. It directly computes the updates for each entry in the updated matrix, instead of using multiple standard matrix multiplications. The algorithms that benefit from this optimization are: 'brunet', 'lee', 'nsNMF' and 'offset'. However, plain R versions of these methods still exist; they implement the updates as standard matrix products. They are accessible by adding the prefix '.R#' to the algorithm name: '.R#brunet', '.R#lee', '.R#nsNMF' and '.R#offset'.
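
As an illustration of what "updates as standard matrix products" means, here is a minimal base R sketch of Lee and Seung's multiplicative updates for the Euclidean objective. It is not the package's code: the function name mu_euclidean_sketch(), its arguments, the random starting point and the small constant eps are assumptions made for this example only.

```r
# Minimal sketch (not the NMF package's implementation): Lee and Seung's
# multiplicative updates for the Euclidean objective ||V - WH||^2,
# written with standard matrix products.
mu_euclidean_sketch <- function(V, rank, maxiter = 200, eps = 1e-9) {
  stopifnot(is.matrix(V), all(V >= 0), rank >= 1)
  n <- nrow(V); p <- ncol(V)
  # random nonnegative starting point
  W <- matrix(runif(n * rank), n, rank)
  H <- matrix(runif(rank * p), rank, p)
  errors <- numeric(maxiter)
  for (i in seq_len(maxiter)) {
    # H <- H * (W'V) / (W'WH); eps guards against division by zero
    H <- H * crossprod(W, V) / (crossprod(W) %*% H + eps)
    # W <- W * (VH') / (WHH')
    W <- W * tcrossprod(V, H) / (W %*% tcrossprod(H) + eps)
    # track the squared residual after each iteration
    errors[i] <- sum((V - W %*% H)^2)
  }
  list(W = W, H = H, errors = errors)
}

set.seed(1)
V <- matrix(runif(20 * 8), 20, 8)
fit <- mu_euclidean_sketch(V, rank = 3)
# inspect how the residual error changes between iterations
range(diff(fit$errors))
```

Each update forms several full intermediate matrices (for example W'W and then W'WH). An entry-wise implementation, as described above for the C++ code, avoids materialising them.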

References

Lee, D. D. and Seung, H. S. (2000). Algorithms for non-negative matrix factorization. In NIPS, 556-562.

Brunet, J. P., Tamayo, P., Golub, T. R., and Mesirov, J. P. (2004). Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci U S A, 101(12), 4164-4169. Original MATLAB code available from: http://www.broadinstitute.org/cancer/pub/nmf

Pascual-Montano, A., Carazo, J. M., Kochi, K., Lehmann, D., and Pascual-Marqui, R. D. (2006). Nonsmooth nonnegative matrix factorization (nsNMF). IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(3), 403-415.

Kim, H. and Park, H. (2007). Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis. Bioinformatics, 23(12), 1495-1502. Original MATLAB code available from: http://www.cc.gatech.edu/~hpark/software/nmfsh_comb.m and http://www.cc.gatech.edu/~hpark/software/fcnnls.m

Badea, L. (2008). Extracting Gene Expression Profiles Common to Colon and Pancreatic Adenocarcinoma Using Simultaneous Nonnegative Matrix Factorization. In Pacific Symposium on Biocomputing, 13, 279-290.

Zhang, J., Wei, L., Feng, X., Ma, Z., and Wang, Y. (2008). Pattern expression nonnegative matrix factorization: algorithm and applications to blind source separation. Computational Intelligence and Neuroscience.

Boutsidis, C. and Gallopoulos, E. (2007). SVD-based initialization: A head start for nonnegative matrix factorization. Pattern Recognition. doi:10.1016/j.patcog.2007.09.010. Original MATLAB code available from: http://www.cs.rpi.edu/~boutsc/papers/paper1/nndsvd.m

See Also

class NMF, NMF-utils, package's vignette

Examples

## DATA
# generate a synthetic dataset with known classes: 100 features, 23 samples (10+5+8)
n <- 100; counts <- c(10, 5, 8); p <- sum(counts) 
V <- syntheticNMF(n, counts, noise=TRUE)
dim(V)

# build the class factor: class k is repeated counts[k] times
groups <- factor(rep(seq_along(counts), counts))

## RUN NMF ALGORITHMS

# run default algorithm
res <- nmf(V, 3)
res
summary(res, class=groups)

# run default algorithm multiple times (only keep the best fit)
res <- nmf(V, 3, nrun=10)
res
summary(res, class=groups)

# run default algorithm multiple times keeping all the fits
res <- nmf(V, 3, nrun=10, .options='k')
res
summary(res, class=groups)

## Note: one could have equivalently done
res <- nmf(V, 3, nrun=10, .options=list(keep.all=TRUE))

# run nonsmooth NMF algorithm
res <- nmf(V, 3, 'nsNMF')
res
summary(res, class=groups)

## Note: partial match also works
nmf(V, 3, 'ns')

# Non default values for the algorithm's parameters can be specified in '...'
res <- nmf(V, 3, 'nsNMF', theta=0.8)

# compare some NMF algorithms (tracking the residual error)
res <- nmf(V, 3, list('brunet', 'lee', 'nsNMF'), seed=123456, .options='t')
res
summary(res, class=groups)
# plot the track of the residual errors
plot(res)

# run on an ExpressionSet (requires package Biobase)
data(esGolub)
nmf(esGolub, 3)

## USING SEEDING METHODS

# run default algorithm with the Non-negative Double SVD seeding method ('nndsvd')
nmf(V, 3, seed='nndsvd')

## Note: partial match also works
nmf(V, 3, seed='nn')

# run nsNMF algorithm, fixing the seed of the random number generator 
nmf(V, 3, 'nsNMF', seed=123456)

# run default algorithm specifying the starting point following the NMF standard model
start.std <- nmfModel(W=matrix(0.5, n, 3), H=matrix(0.2, 3, p))   
nmf(V, seed=start.std)

# to run nsNMF algorithm with an explicit starting point, this one
# needs to follow the 'NMFns' model:
start.ns <- nmfModel(model='NMFns', W=matrix(0.5, n, 3), H=matrix(0.2, 3, p))   
nmf(V, seed=start.ns)
# Note: the method name does not need to be specified as it is inferred from the
# model when there is only one algorithm defined for that model.

# if the model is not appropriate (as defined by the algorithm) an error is thrown
# [cf. the standard model doesn't include a smoothing parameter used in nsNMF]
# (wrapped in try() so that the rest of the examples still run)
try( nmf(V, method='ns', seed=start.std) )

## Callback functions
# Pass a callback function to only save summary measure of each run
res <- nmf(V, 3, nrun=3, .callback=summary)
# the callback results are simplified into a matrix
res$.callback

# Pass a custom callback function
cb <- function(obj){ sparseness(obj) >= 0.5 }
res <- nmf(V, 3, nrun=3, .callback=cb)
res$.callback

# Pass a callback function which throws an error on its first call
cb <- function() {
  i <- 0
  function(object) {
    i <<- i + 1
    if (i == 1) stop('SOME BIG ERROR')
    summary(object)
  }
}
res <- nmf(V, 3, nrun=3, .callback=cb())
