rsparse (version 0.4.0)

soft_impute: SoftImpute/SoftSVD matrix factorization

Description

Fit SoftImpute/SoftSVD via fast alternating least squares. Based on the paper "Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares" by Trevor Hastie, Rahul Mazumder, Jason D. Lee, and Reza Zadeh: https://arxiv.org/pdf/1410.2596.pdf

Usage

soft_impute(
  x,
  rank = 10L,
  lambda = 0,
  n_iter = 100L,
  convergence_tol = 0.001,
  init = NULL,
  final_svd = TRUE
)

soft_svd(
  x,
  rank = 10L,
  lambda = 0,
  n_iter = 100L,
  convergence_tol = 0.001,
  init = NULL,
  final_svd = TRUE
)

Arguments

x

sparse matrix. Both CSR (dgRMatrix) and CSC (dgCMatrix) formats are supported. CSR is preferred because the algorithm can then use multithreaded CSR * dense matrix products (if OpenMP is supported on your platform). On many-core machines this reduces fitting time significantly.
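
As a minimal sketch of how a CSC matrix might be converted to CSR before fitting, the snippet below uses the Matrix package's as() coercion to the virtual class "RsparseMatrix". The toy matrix and the names m_csc and m_csr are illustrative assumptions, not part of rsparse.

```r
library(Matrix)

# Small CSC matrix (dgCMatrix), the default compressed format in Matrix.
m_csc <- sparseMatrix(
  i = c(1, 2, 3, 3),
  j = c(1, 2, 1, 3),
  x = c(5, 3, 4, 1),
  dims = c(3, 3)
)

# Coerce to the row-compressed (CSR) representation.
m_csr <- as(m_csc, "RsparseMatrix")

class(m_csc)
class(m_csr)
```

The coercion changes only the storage layout; the stored values are the same, so either object can be passed as x.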

rank

maximum rank of the low-rank solution.

lambda

regularization parameter for the nuclear norm
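
To illustrate what a nuclear-norm penalty does, here is a sketch for the simpler case of a fully observed dense matrix, where the penalized least-squares solution is obtained by soft-thresholding the singular values. The helper soft_threshold_svd is hypothetical and is not how rsparse fits the model (rsparse uses alternating least squares on sparse input); how rsparse scales lambda internally is not stated here.

```r
# Hypothetical helper (not part of rsparse): shrink singular values by lambda
# and drop the components that are shrunk to zero.
soft_threshold_svd <- function(x, lambda) {
  s <- svd(x)
  d_shrunk <- pmax(s$d - lambda, 0)
  keep <- d_shrunk > 0
  list(
    u = s$u[, keep, drop = FALSE],
    d = d_shrunk[keep],
    v = s$v[, keep, drop = FALSE]
  )
}

set.seed(1)
x <- matrix(rnorm(20), nrow = 5)

fit0 <- soft_threshold_svd(x, lambda = 0)  # plain SVD
fit1 <- soft_threshold_svd(x, lambda = 1)  # shrunken singular values

length(fit0$d)
length(fit1$d)
```

A larger lambda shrinks every singular value and can reduce the effective rank below the requested maximum.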

n_iter

maximum number of iterations of the algorithm

convergence_tol

convergence tolerance. Internally the function keeps track of the relative change in the Frobenius norm between two consecutive iterations. If the change is less than convergence_tol, the process is considered converged and the function returns the result.
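
One plausible way to express such a stopping criterion is sketched below in base R on dense matrices. The helper rel_frob_change is hypothetical; the exact formula rsparse uses internally (for example, whether the norm is squared) is an assumption here.

```r
# Hypothetical helper: relative Frobenius-norm change between the matrices
# reconstructed from two svd-like lists (components u, d, v).
rel_frob_change <- function(old, new) {
  # d * t(v) scales row i of t(v) by d[i], i.e. it equals diag(d) %*% t(v).
  m_old <- old$u %*% (old$d * t(old$v))
  m_new <- new$u %*% (new$d * t(new$v))
  norm(m_new - m_old, type = "F") / norm(m_old, type = "F")
}

set.seed(1)
fit_old <- svd(matrix(rnorm(12), nrow = 3))
fit_new <- fit_old
fit_new$d <- fit_old$d * 1.1  # scale the whole reconstruction by 10%

change <- rel_frob_change(fit_old, fit_new)
converged <- change < 0.001
```

In this toy case the reconstruction changes by 10%, so the check against a tolerance of 0.001 reports that the fit has not converged.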

init

svd-like object with u, v, d components used to initialize the algorithm. The algorithm benefits from warm starts. init may have any rank up to the maximum allowed rank. If init has rank less than the maximum rank, it is padded automatically.
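
A warm start might look like the following sketch, which fits once with a stronger penalty and reuses that solution to initialize a fit with a weaker one. The lambda values 10 and 1 and the 100 x 200 subset are arbitrary illustrative choices, and the init list is rebuilt explicitly from the documented u, v, d components.

```r
library(rsparse)

data('movielens100k')
m <- movielens100k[1:100, 1:200]

# First fit with a stronger penalty (lambda value chosen only for illustration).
fit_strong <- soft_impute(m, rank = 10L, lambda = 10)

# Reuse the previous solution as a warm start for a weaker penalty.
warm_start <- list(u = fit_strong$u, v = fit_strong$v, d = fit_strong$d)
fit_weak <- soft_impute(m, rank = 10L, lambda = 1, init = warm_start)
```

Fitting a decreasing sequence of lambda values this way is the usual reason to pass init, since each fit starts close to its solution.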

final_svd

logical, whether to apply a final post-processing step with SVD. This is not necessary but cleans up the rank nicely; it is highly recommended to leave it TRUE.

Value

svd-like object: list(u, v, d). The u and v components hold the left and right singular vectors, and d holds the singular values.
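
As a sketch of how such an svd-like list can be used, a single entry (i, j) of the low-rank reconstruction can be computed without forming the full matrix. The helper predict_entry is hypothetical, and base svd() on a small dense matrix stands in for a fitted model here.

```r
# Hypothetical helper: value of the reconstruction u %*% diag(d) %*% t(v)
# at row i, column j, computed from the factors only.
predict_entry <- function(fit, i, j) {
  sum(fit$u[i, ] * fit$d * fit$v[j, ])
}

set.seed(1)
x <- matrix(rnorm(12), nrow = 3)
fit <- svd(x)  # full-rank SVD, so the reconstruction equals x

predict_entry(fit, 2, 3)
x[2, 3]
```

With a truncated or regularized fit the two values differ, and the first one serves as the imputed value for that cell.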

Examples

# NOT RUN {
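library(rsparse)

set.seed(42)
data('movielens100k')
k = 10
seq_k = seq_len(k)
# Take a small 100 x 200 block of the ratings matrix.
m = movielens100k[1:100, 1:200]
# Reference: exact SVD, truncated below to the top k components.
svd_ground_true = svd(m)
svd_soft_svd = soft_svd(m, rank = k, n_iter = 100, convergence_tol = 1e-6)
# Rank-k reconstruction from the exact SVD.
m_restored_svd = svd_ground_true$u[, seq_k] %*%
  diag(x = svd_ground_true$d[seq_k]) %*%
  t(svd_ground_true$v[, seq_k])
# Reconstruction from the soft_svd fit.
m_restored_soft_svd = svd_soft_svd$u %*%
  diag(x = svd_soft_svd$d) %*%
  t(svd_soft_svd$v)
# Compare the two reconstructions up to the stated tolerance.
all.equal(m_restored_svd, m_restored_soft_svd, tolerance = 1e-1)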
# }
