Solving the non-negative linear regression problem $$\mathop{\mathrm{argmin}}_{\beta \ge 0} L(y, x\beta) + \alpha_1 ||\beta||_2^2 + \alpha_2 \sum_{i < j} \beta_{\cdot i}^T \beta_{\cdot j} + \alpha_3 ||\beta||_1$$ where \(L\) is a loss function, either square error or Kullback-Leibler divergence.
nnlm(x, y, alpha = rep(0, 3), method = c("scd", "lee"),
loss = c("mse", "mkl"), init = NULL, mask = NULL, check.x = TRUE,
max.iter = 10000L, rel.tol = 1e-12, n.threads = 1L,
show.warning = TRUE)
x: Design matrix
y: Vector or matrix of response
alpha: A vector of non-negative values with length up to 3, giving the [L2, angle, L1] regularization coefficients on beta (applied to non-masked entries); see the sketch after this argument list
method: Iteration algorithm, either 'scd' for sequential coordinate-wise descent or 'lee' for Lee's multiplicative algorithm
loss: Loss function to use, either 'mse' for mean square error or 'mkl' for mean KL-divergence. Note that if x or y contains negative values, one should always use 'mse'
init: Initial value of beta for iteration. Either NULL (default) or a non-negative matrix of the same shape as beta
mask: Either NULL (default) or a logical matrix of the same shape as beta, indicating whether an entry should be fixed to its initial value (if init is specified) or to 0 (if init is not specified)
check.x: Whether to check the condition number of x to ensure a unique solution. Defaults to TRUE, but can be slow
max.iter: Maximum number of iterations
rel.tol: Stopping criterion, the relative change of the error between two successive iterations, computed as \(2|e_2-e_1|/(e_2+e_1)\). One can specify a negative number to force exactly max.iter iterations, i.e., no early stopping
n.threads: An integer number of threads/CPUs to use. Defaults to 1 (no parallelization). Use 0 or a negative value for all cores
show.warning: Whether to show warnings when they occur. Defaults to TRUE
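The three entries of alpha correspond, in order, to the L2, angle, and L1 penalties in the objective above. A minimal sketch, using simulated data and illustrative penalty values:

library(NNLM)
set.seed(42);
x <- matrix(runif(30*10), 30, 10);
y <- x %*% matrix(rexp(10*2), 10, 2);
beta.l2 <- nnlm(x, y, alpha = c(0.1, 0, 0));  # L2 (ridge-like) penalty only
beta.l1 <- nnlm(x, y, alpha = c(0, 0, 0.1));  # L1 (lasso-like) penalty only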
An object of class 'nnlm', which is a list with the components below; a brief inspection sketch follows.
coefficients : a matrix or vector (depending on y) of the NNLM solution, i.e., \(\beta\)
n.iteration : total number of iterations (summed over all columns of beta)
error : a vector of errors/losses of the solution, given as c(MSE, MKL, target.error)
options : a list recording the input arguments
call : the function call
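A minimal sketch of inspecting these components, reusing x and y from the sketch above:

fit <- nnlm(x, y);
fit$coefficients;  # the NNLM solution beta
fit$error;         # c(MSE, MKL, target.error)
fit$n.iteration;   # iterations summed over all columns of beta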
The linear model is solved column by column, and the columns are processed in parallel. When \(y_{\cdot j}\) (the j-th column) contains missing values, only the complete entries are used to solve \(\beta_{\cdot j}\). Therefore, the number of complete entries in each column should be no smaller than the number of columns of x when no penalty is used.
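A minimal sketch of this missing-value handling; the NA positions below are arbitrary:

set.seed(1);
x <- matrix(runif(50*20), 50, 20);
y <- x %*% matrix(rexp(20*2), 20, 2);
y[sample(length(y), 10)] <- NA;  # introduce missing responses
fit.na <- nnlm(x, y);            # each column of beta is solved on complete entries only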
method = 'scd' is recommended, especially when the solution is expected to be sparse. Though both 'mse' and 'mkl' losses are supported for non-negative x and y, only 'mse' is appropriate when either x or y contains negative values. Note that loss 'mkl' is much slower than loss 'mse', which may be a concern when x and y are extremely large.
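A hedged sketch comparing the two algorithms on the same problem, reusing x and y from the sketch above; the iteration counts are illustrative, not a benchmark:

fit.scd <- nnlm(x, y, method = 'scd');
fit.lee <- nnlm(x, y, method = 'lee');
c(scd = fit.scd$n.iteration, lee = fit.lee$n.iteration);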
mask can be used for hard regularization, i.e., forcing entries to their initial values (if init is specified) or to 0 (if init is not specified). Internally, mask is implemented by skipping the masked entries during the element-wise iteration.
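A minimal sketch of masking, again reusing x and y from above (so beta is 20 x 2); since init is not given, masked entries are fixed to 0:

m <- matrix(FALSE, 20, 2);
m[1:5, 1] <- TRUE;                   # mask the first 5 coefficients of column 1
fit.masked <- nnlm(x, y, mask = m);
fit.masked$coefficients[1:5, 1];     # all exactly 0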
Franc, Vojtěch, Václav Hlaváč, and Mirko Navara. 2005. "Sequential Coordinate-Wise Algorithm for the Non-Negative Least Squares Problem." Proc. Int'l Conf. Computer Analysis of Images and Patterns, Lecture Notes in Computer Science 3691: 407.
Lee, Daniel D., and H. Sebastian Seung. 1999. "Learning the Parts of Objects by Non-Negative Matrix Factorization." Nature 401: 788-91.
Pascual-Montano, Alberto, J.M. Carazo, Kieko Kochi, Dietrich Lehmann, and Roberto D. Pascual-Marqui. 2006. "Nonsmooth Nonnegative Matrix Factorization (NsNMF)." IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (3): 403-14.
# NOT RUN {
# without negative value
x <- matrix(runif(50*20), 50, 20);
beta <- matrix(rexp(20*2), 20, 2);
y <- x %*% beta + 0.1*matrix(runif(50*2), 50, 2);
beta.hat <- nnlm(x, y, loss = 'mkl');
# with negative values
x2 <- 10*matrix(rnorm(50*20), 50, 20);
y2 <- x2 %*% beta + 0.2*matrix(rnorm(50*2), 50, 2);
beta.hat2 <- nnlm(x2, y2);  # default loss 'mse', required when values are negative
# }