An implementation of the AdaptiveImpute matrix completion algorithm
for sparse matrices.
adaptive_impute(
X,
rank,
...,
initialization = c("svd", "adaptive-initialize", "approximate"),
max_iter = 200L,
check_interval = 1L,
epsilon = 1e-07,
additional = NULL
)

# S3 method for sparseMatrix
adaptive_impute(
X,
rank,
...,
initialization = c("svd", "adaptive-initialize", "approximate"),
additional = NULL
)
# S3 method for LRMF
adaptive_impute(
X,
rank,
...,
epsilon = 1e-07,
max_iter = 200L,
check_interval = 1L
)
A low rank matrix factorization represented by an
adaptive_imputation() object.
A sparse matrix of Matrix::sparseMatrix() class.
Desired rank (integer) to use in the low rank approximation.
Must be at least 2L and at most the rank of X. Note that the rank
of X is typically unobserved and computations may be unstable, or
even fail, when rank approaches or exceeds this threshold.
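As an illustration of these constraints, a minimal sketch (assuming the fastadi and Matrix packages are installed; the simulated matrix and its dimensions are purely illustrative):

```r
library(Matrix)
library(fastadi)

# simulate a small sparse matrix with a modest number of observed entries
set.seed(27)
X <- rsparsematrix(100, 80, nnz = 800)

# rank must be at least 2L and well below min(dim(X)) for stable computation
mf <- adaptive_impute(X, rank = 3L, max_iter = 10L)
```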
Unused additional arguments.
How to initialize the low rank approximation. Options are:
"svd" (default). In the initialization step, this treats
unobserved values as zeroes.
"adaptive-initialize". In the initialization step, this treats
unobserved values as actually unobserved. However, the current
AdaptiveInitialize implementation relies on dense matrix
computations that are only suitable for relatively small matrices.
"approximate". An approximate variant of AdaptiveInitialize
that is less computationally expensive. See adaptive_initialize
for details.
Note that initialization matters as AdaptiveImpute optimizes
a non-convex objective. The current theory shows that initializing
with AdaptiveInitialize leads to a consistent estimator, but it
isn't known whether this is the case for SVD initialization. Empirically,
we have found that SVD initialization works well nonetheless.
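For example, the initialization strategies can be compared directly; this sketch assumes a sparse matrix X small enough for the dense AdaptiveInitialize computations:

```r
# default: SVD initialization, treating unobserved entries as zeroes
mf_svd <- adaptive_impute(X, rank = 3L, initialization = "svd")

# AdaptiveInitialize: theoretically consistent, but relies on dense
# computations, so only suitable for relatively small matrices
mf_adaptive <- adaptive_impute(
  X,
  rank = 3L,
  initialization = "adaptive-initialize"
)
```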
Maximum number of iterations to perform (integer). Defaults
to 200L. In practice 10 or so iterations will get you a decent
approximation to use in exploratory analysis, and 50-100 will get
you most of the way to convergence. Must be at least 1L.
Integer specifying how often to perform convergence
checks. Defaults to 1L. In practice, checking for convergence requires
a norm calculation that is expensive for large matrices, and decreasing
the frequency of convergence checks will reduce computation time. Can
also be set to NULL, in which case max_iter iterations of the algorithm
will occur with no possibility of stopping due to small relative change
in the imputed matrix. In this case delta will be reported as Inf.
Convergence criteria, measured in terms of relative change
in Frobenius norm of the full imputed matrix. Defaults to 1e-7.
Ignored except when alpha_method = "approximate"
in which case it controls the precision of the approximation to alpha.
The approximate computation of alpha will always understate alpha,
but the approximation will be better for larger values of additional.
We recommend making additional as large as computationally tolerable.
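A sketch of the approximate initialization (the value 20L for additional is an arbitrary illustration; larger values give a better approximation to alpha at greater computational cost):

```r
mf_approx <- adaptive_impute(
  X,
  rank = 3L,
  initialization = "approximate",
  additional = 20L  # larger values approximate alpha more closely
)
```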
Cho, Juhee, Donggyu Kim, and Karl Rohe. “Asymptotic Theory for Estimating the Singular Vectors and Values of a Partially-Observed Low Rank Matrix with Noise.” Statistica Sinica, 2018. https://doi.org/10.5705/ss.202016.0205.
———. “Intelligent Initialization and Adaptive Thresholding for Iterative Matrix Completion: Some Statistical and Algorithmic Theory for Adaptive-Impute.” Journal of Computational and Graphical Statistics 28, no. 2 (April 3, 2019): 323–33. https://doi.org/10.1080/10618600.2018.1518238.
mf <- adaptive_impute(ml100k, rank = 3L, max_iter = 5L, check_interval = NULL)
mf