WRMF: Weighted Regularized Matrix Factorization for collaborative filtering

Description

Creates a matrix factorization model that is solved with Alternating Least Squares (weighted ALS for implicit feedback). The implicit-feedback model follows "Collaborative Filtering for Implicit Feedback Datasets" (Hu, Koren, Volinsky). The explicit-feedback model is the classic rating matrix decomposition with squared-error (MSE) loss. Both algorithms are widely used in recommender systems and work well in practice.
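To make the alternating scheme concrete, here is a minimal base R sketch of ALS on a small, fully observed, dense ratings matrix. It is a toy illustration only and not the rsparse implementation: the data are random, all names (`R`, `U`, `V`, `loss`, `history`) are hypothetical, and missing entries, confidence weights and biases are ignored.

```r
# Toy dense ALS for a fully observed ratings matrix (illustration only).
set.seed(1)
n_users = 6
n_items = 5
rank = 2
lambda = 0.1
R = matrix(runif(n_users * n_items, 1, 5), n_users, n_items)

# small random starting embeddings
U = matrix(rnorm(n_users * rank, sd = 0.1), n_users, rank)  # users
V = matrix(rnorm(n_items * rank, sd = 0.1), n_items, rank)  # items

# regularized squared-error objective
loss = function(R, U, V, lambda) {
  sum((R - U %*% t(V))^2) + lambda * (sum(U^2) + sum(V^2))
}

history = numeric(0)
for (iter in 1:10) {
  # fix V: ridge regression for all users at once
  U = t(solve(crossprod(V) + lambda * diag(rank), t(V) %*% t(R)))
  # fix U: ridge regression for all items at once
  V = t(solve(crossprod(U) + lambda * diag(rank), t(U) %*% R))
  history = c(history, loss(R, U, V, lambda))
}
```

Each half-step minimizes the objective exactly over one factor while the other is held fixed, so `history` should never increase from one iteration to the next.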

Arguments

Super class

rsparse::MatrixFactorizationRecommender -> WRMF

Methods

Public methods

Inherited methods

rsparse::MatrixFactorizationRecommender$predict()

Method `new()`

creates WRMF model

Usage

WRMF$new(
  rank = 10L,
  lambda = 0,
  dynamic_lambda = TRUE,
  init = NULL,
  preprocess = identity,
  feedback = c("implicit", "explicit"),
  solver = c("conjugate_gradient", "cholesky", "nnls"),
  with_user_item_bias = FALSE,
  with_global_bias = FALSE,
  cg_steps = 3L,
  precision = c("double", "float"),
  ...
)

Arguments

rank

size of the latent dimension

lambda

regularization parameter

dynamic_lambda

whether `lambda` is to be scaled according to the number

init

initialization of item embeddings

preprocess

identity() by default. User-specified function that is applied to the user-item interaction matrix before running matrix factorization (it is also applied at inference time before making predictions). For example, we may want to normalize each row of the user-item matrix to have unit norm, or apply log1p() to discount large counts. This corresponds to the "confidence" function from the "Collaborative Filtering for Implicit Feedback Datasets" paper. Note that it will not automatically add +1 to the weights of the positive entries.
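As a rough sketch of the two transformations mentioned above, the following uses only the Matrix package on a made-up count matrix. `normalize_rows` is a hypothetical helper, and unit L2 norm is an assumption here, since the text does not say which norm is meant.

```r
library(Matrix)

# toy user-item count matrix (3 users x 4 items); values are made up
x = sparseMatrix(
  i = c(1, 1, 2, 3, 3),
  j = c(1, 3, 2, 1, 4),
  x = c(10, 1, 4, 2, 50),
  dims = c(3, 4)
)

# discount large counts; log1p(0) == 0, so empty cells stay empty
x_log = log1p(x)

# hypothetical helper: scale every row to unit L2 norm
# (assumes no all-zero rows, otherwise the division yields Inf)
normalize_rows = function(m) {
  Diagonal(x = 1 / sqrt(rowSums(m^2))) %*% m
}
x_unit = normalize_rows(x)
```

Either function could then be supplied as the `preprocess` argument, for example `WRMF$new(preprocess = log1p)`.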

feedback

character - feedback type - one of c("implicit", "explicit")

solver

character - solver name. One of c("conjugate_gradient", "cholesky", "nnls"). Usually approximate "conjugate_gradient" is significantly faster and solution is on par with "cholesky". "nnls" performs non-negative matrix factorization (NNMF) - restricts user and item embeddings to be non-negative.

with_user_item_bias

logical; controls whether the model calculates user and item biases. At the moment only implemented for "explicit" feedback.

with_global_bias

logical; controls whether the model calculates a global bias (mean). At the moment only implemented for "explicit" feedback.

cg_steps

integer > 0 - maximum number of internal steps in conjugate gradient (used only with the "conjugate_gradient" solver). cg_steps = 3 by default. Controls the precision of the linear system solution at each ALS step. There is usually no need to tune this parameter.
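To give a feel for what a small step budget means, here is a minimal base R sketch of textbook conjugate gradient on a small regularized least-squares system, compared against an exact Cholesky solve. `cg_solve` is a hypothetical helper written for this illustration; it is not part of rsparse, and the package's internal solver may differ in its details (for example in how it is initialized).

```r
# Hypothetical helper: plain conjugate gradient for a symmetric
# positive definite system A x = b, stopped after `steps` iterations.
cg_solve = function(A, b, x0 = rep(0, length(b)), steps = 3L) {
  x = x0
  r = b - A %*% x          # residual
  p = r                    # search direction
  rs_old = sum(r * r)
  for (k in seq_len(steps)) {
    if (rs_old < 1e-20) break
    Ap = A %*% p
    alpha = rs_old / sum(p * Ap)
    x = x + alpha * p
    r = r - alpha * Ap
    rs_new = sum(r * r)
    p = r + (rs_new / rs_old) * p
    rs_old = rs_new
  }
  drop(x)
}

set.seed(42)
M = matrix(rnorm(20 * 4), 20, 4)
A = crossprod(M) + 0.1 * diag(4)   # regularized normal equations
b = rnorm(4)

# exact solution through the Cholesky factor (A = t(chol_factor) %*% chol_factor)
chol_factor = chol(A)
exact = backsolve(chol_factor, forwardsolve(t(chol_factor), b))

approx = cg_solve(A, b, steps = 3L)  # truncated, approximate solution
full = cg_solve(A, b, steps = 4L)    # as many steps as unknowns
```

In exact arithmetic conjugate gradient reaches the true solution after as many steps as there are unknowns, so `full` should agree with `exact` up to rounding, while `approx` trades some accuracy for fewer matrix-vector products.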

precision

one of c("double", "float"). Whether embedding matrices are stored as base numeric matrices or as float matrices (from the float package). The latter is usually about 2x faster and consumes less RAM, but float matrices are not "base" objects, so use them carefully.

...

not used at the moment

Method `fit_transform()`

fits the model

Usage

WRMF$fit_transform(
  x,
  n_iter = 10L,
  convergence_tol = ifelse(private$feedback == "implicit", 0.005, 0.001),
  ...
)

Arguments

x

input matrix (preferably in CSC format, `CsparseMatrix`)

n_iter

max number of ALS iterations

convergence_tol

convergence tolerance checked between iterations

...

not used at the moment

Method `transform()`

create user embeddings for new input

Usage

WRMF$transform(x, ...)

Arguments

x

user-item interaction matrix (preferably as `dgRMatrix`)

...

not used at the moment
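Since `fit_transform()` prefers a `CsparseMatrix` and `transform()` prefers a `dgRMatrix`, it can help to convert between the two layouts explicitly. The sketch below uses only the Matrix package; the toy data and the names `x_csc` and `x_csr` are made up for illustration.

```r
library(Matrix)

# toy interactions; sparseMatrix() builds the column-compressed (CSC) layout
x_csc = sparseMatrix(
  i = c(1, 1, 2, 3),
  j = c(1, 3, 2, 3),
  x = c(5, 1, 3, 2),
  dims = c(3, 3)
)

# row-compressed (CSR) copy holding the same values
x_csr = as(x_csc, "RsparseMatrix")

class(x_csc)
class(x_csr)
```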

Method `clone()`

The objects of this class are cloneable with this method.

Usage

WRMF$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

References

Hu, Yifan, Yehuda Koren, and Chris Volinsky. "Collaborative filtering for implicit feedback datasets." 2008 Eighth IEEE International Conference on Data Mining. IEEE, 2008.
https://math.stackexchange.com/questions/1072451/analytic-solution-for-matrix-factorization-using-alternating-least-squares/1073170#1073170
http://activisiongamescience.github.io/2016/01/11/Implicit-Recommender-Systems-Biased-Matrix-Factorization/
https://jessesw.com/Rec-System/
http://www.benfrederickson.com/matrix-factorization/
http://www.benfrederickson.com/fast-implicit-matrix-factorization/
Franc, Vojtech, Vaclav Hlavac, and Mirko Navara. "Sequential coordinate-wise algorithm for the non-negative least squares problem." International Conference on Computer Analysis of Images and Patterns. Springer, Berlin, Heidelberg, 2005.
Zhou, Yunhong, et al. "Large-scale parallel collaborative filtering for the netflix prize." International conference on algorithmic applications in management. Springer, Berlin, Heidelberg, 2008.

Examples

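library(rsparse)
data('movielens100k')
# hold out the rows after the first 900 for validation
train = movielens100k[1:900, ]
cv = movielens100k[901:nrow(movielens100k), ]
model = WRMF$new(rank = 5, lambda = 0, feedback = 'implicit')
# fit on the training rows; returns the user embeddings
user_emb = model$fit_transform(train, n_iter = 5, convergence_tol = -1)
item_emb = model$components
# top-10 recommendations for the held-out rows, excluding items in cv
preds = model$predict(cv, k = 10, not_recommend = cv)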
