Convenience wrapper for softImpute() that makes it possible
to supply a grid of values for the regularization parameter. Other
noteworthy differences from the original function are that the columns of
the data matrix are centered internally, that some of the default values are
different, and that the output is structured differently. Moreover, for
discrete rating-scale data, the wrapper function can include a
discretization step after fitting the algorithm to map the imputed values to
the rating scale of the observed values.
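To make the discretization step described above more concrete, the following base R sketch snaps each imputed value to the closest value on a rating scale. The helper name snap_to_scale() and the toy numbers are hypothetical, and nearest-value rounding is only an assumption about what such a step could look like; the package's internal procedure may differ.

```r
# Hypothetical helper: snap each imputed value to the closest allowed rating.
# Illustration only, not the package's internal code.
snap_to_scale <- function(X_imputed, values) {
  values <- sort(values)
  # index of the nearest allowed value for every entry
  nearest <- vapply(as.vector(X_imputed),
                    function(x) which.min(abs(x - values)),
                    integer(1))
  matrix(values[nearest], nrow = nrow(X_imputed),
         dimnames = dimnames(X_imputed))
}

# Made-up continuous imputations on a 1-to-5 rating scale
X_imputed <- matrix(c(1.2, 4.7, 3.4, 0.3, 5.9, 2.6), nrow = 2)
snap_to_scale(X_imputed, values = 1:5)
```

Values outside the scale (such as 0.3 or 5.9 here) end up at the nearest endpoint, so the result always stays within the observed rating range.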
soft_impute(
  X,
  lambda = fraction_grid(reverse = TRUE),
  relative = TRUE,
  type = c("svd", "als"),
  rank.max = NULL,
  thresh = 1e-05,
  maxit = 100L,
  trace.it = FALSE,
  final.svd = TRUE,
  discretize = TRUE,
  values = NULL
)
An object of class "soft_impute" with the following components:
a numeric vector containing the values of the regularization parameter.
a numeric value with which the values of the regularization
parameter were multiplied. If relative = TRUE, the value returned by
lambda0() (applied to the mean-centered data
matrix), otherwise 1.
in case of a single value of lambda, an object returned by
softImpute(). Otherwise a list of such objects.
in case of a single value of lambda, a numeric matrix
containing the completed (i.e., imputed) data matrix. Otherwise a list of
such matrices.
in case of a single value of lambda, a numeric
matrix containing the completed (i.e., imputed) data matrix after the
discretization step. Otherwise a list of such matrices. This is only
returned if requested via discretize = TRUE.
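As an illustration of how relative values of the regularization parameter turn into the values actually used, here is a base R sketch. It assumes that the reference value behaves like the largest singular value of the mean-centered matrix with missing cells set to zero; this is a stand-in for lambda0(), not a call to it. The data, the fractions, and all variable names are made up for the example.

```r
# Toy rating matrix with missing entries (made-up data)
X <- matrix(c(5, NA, 3, 1,
              4, 2, NA, 1,
              NA, 1, 4, 5),
            nrow = 3, byrow = TRUE)

# Center each column using its observed entries only
X_centered <- sweep(X, 2, colMeans(X, na.rm = TRUE))

# Assumed reference value: largest singular value of the centered matrix
# with missing cells set to zero (stand-in for lambda0())
X_filled <- X_centered
X_filled[is.na(X_filled)] <- 0
reference <- svd(X_filled)$d[1]

# Hypothetical fractions, largest first (not the output of fraction_grid())
fractions <- c(1, 0.5, 0.1)

# With relative = TRUE, the fractions are scaled by the reference value;
# with relative = FALSE, the multiplier would simply be 1
lambda_absolute <- fractions * reference
lambda_absolute
```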
The class structure is still experimental and may change in the future. The following accessor functions are available:
get_completed() to extract the completed (i.e.,
imputed) data matrix for a specified value of the regularization
parameter,
get_lambda() to extract the values of the
regularization parameter.
X: a matrix or data frame with missing values.
lambda: a numeric vector giving values of the regularization
parameter. See fraction_grid() for the default values.
relative: a logical indicating whether the values of the
regularization parameter should be considered relative to a
reference value computed from the data at hand. If TRUE (the
default), the values of lambda are multiplied by the value
returned by lambda0() (applied to the
mean-centered data matrix).
type: a character string specifying the type of algorithm. Possible
values are "svd" and "als". See
softImpute() for details on the algorithms, but
note that the default value here is "svd".
rank.max: a positive integer giving a rank constraint. See
softImpute() for more details, but note that the
default here is to use the minimum of the number of rows and columns minus 1
if type is "svd", and to use 2 if type is "als".
thresh, maxit, trace.it, final.svd: see
softImpute().
discretize: a logical indicating whether to include a discretization
step after fitting the algorithm (defaults to TRUE). For
discrete rating-scale data, this can be used to map the imputed values to
the discrete rating scale of the observed values.
values: an optional numeric vector giving the possible values of
discrete ratings. This is ignored if discretize is FALSE.
Currently, the possible values are assumed to be the same for all columns.
If NULL, the unique values of the observed parts of X are
used.
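The documented defaults for the rank constraint and the rating values can be written out in a few lines of base R. The helper default_settings() below is hypothetical and only mirrors the defaults as described above, reading the rank default as min(dim(X)) - 1; it is not taken from the package source.

```r
# Hypothetical helper mirroring the documented defaults; not package code.
default_settings <- function(X, type = c("svd", "als")) {
  type <- match.arg(type)   # first choice, "svd", is the default
  X <- as.matrix(X)
  # rank constraint: min(rows, columns) - 1 for "svd", 2 for "als"
  rank_max <- if (type == "svd") min(dim(X)) - 1L else 2L
  # possible ratings: unique values among the observed cells of X
  values <- sort(unique(X[!is.na(X)]))
  list(type = type, rank.max = rank_max, values = values)
}

# Made-up 5 x 4 rating matrix with missing entries
X <- matrix(c(1, 2, NA, 4, 5,
              3, NA, 3, 2, 1,
              5, 5, 4, NA, 2,
              NA, 1, 2, 3, 4),
            nrow = 5)
str(default_settings(X))
str(default_settings(X, type = "als"))
```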
Andreas Alfons and Aurore Archimbaud
Hastie, T., Mazumder, R., Lee, J. D. and Zadeh, R. (2015) Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares. Journal of Machine Learning Research, 16(104), 3367–3402.
Mazumder, R., Hastie, T. and Tibshirani, R. (2010) Spectral Regularization Algorithms for Learning Large Incomplete Matrices. Journal of Machine Learning Research, 11(80), 2287–2322.
soft_impute_tune(), fraction_grid()
# toy example derived from MovieLens 100K dataset
data("MovieLensToy")
# Soft-Impute with discretization step
fit <- soft_impute(MovieLensToy)
# extract the discretized completed matrix for the fifth value
# of the regularization parameter
X_hat <- get_completed(fit, which = 5)
head(X_hat)