rsparse (version 0.4.0)

WRMF: Weighted Regularized Matrix Factorization for collaborative filtering

Description

Creates a matrix factorization model that is solved with Alternating Least Squares (weighted ALS for implicit feedback). For implicit feedback, see "Collaborative Filtering for Implicit Feedback Datasets" (Hu, Koren, Volinsky). For explicit feedback, the model is the classic rating-matrix decomposition with MSE loss (without biases at the moment). Both approaches are well established in recommender systems.
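For orientation, the implicit-feedback objective can be written as below, following the formulation in the cited Hu, Koren, Volinsky paper. The symbols are the paper's, not identifiers from this package: r_ui is the raw interaction, p_ui the binary preference, c_ui the confidence weight, x_u and y_i the user and item embeddings of length rank, and α a confidence scale. How the package maps preprocess and lambda onto these terms is an assumption here; exact scaling may differ.

```latex
\min_{x_*,\, y_*} \;
  \sum_{u,i} c_{ui} \left( p_{ui} - x_u^{\top} y_i \right)^2
  + \lambda \left( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \right),
\qquad
p_{ui} =
  \begin{cases}
    1, & r_{ui} > 0 \\
    0, & r_{ui} = 0
  \end{cases},
\qquad
c_{ui} = 1 + \alpha \, r_{ui}
```

ALS alternates between solving for all x_u with the y_i fixed and solving for all y_i with the x_u fixed; each sub-problem is a regularized least-squares problem.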


Methods

Public methods

Method new()

creates WRMF model

Usage

WRMF$new(
  rank = 10L,
  lambda = 0,
  init = NULL,
  preprocess = identity,
  feedback = c("implicit", "explicit"),
  non_negative = FALSE,
  solver = c("conjugate_gradient", "cholesky"),
  cg_steps = 3L,
  precision = c("double", "float"),
  ...
)

Arguments

rank

size of the latent dimension

lambda

regularization parameter

init

initialization of item embeddings

preprocess

identity() by default. User-specified function which is applied to the user-item interaction matrix before running matrix factorization (it is also applied at inference time before making predictions). For example, we may want to normalize each row of the user-item matrix to unit norm, or apply log1p() to discount large counts. This corresponds to the "confidence" function from the "Collaborative Filtering for Implicit Feedback Datasets" paper.

feedback

character - feedback type - one of c("implicit", "explicit")

non_negative

logical, whether to perform non-negative factorization

solver

character - solver for the "implicit feedback" problem. One of c("conjugate_gradient", "cholesky"). The approximate "conjugate_gradient" solver is usually significantly faster, and its solution is on par with "cholesky".

cg_steps

integer > 0 - max number of internal steps in conjugate gradient (if the "conjugate_gradient" solver is used). cg_steps = 3 by default. Controls the precision of the linear system solution at each ALS step. There is usually no need to tune this parameter.

precision

one of c("double", "float"). Whether embedding matrices should be numeric or float (from the float package). The latter is usually about 2x faster and consumes less RAM, but float matrices are not "base" objects, so use them carefully.

...

not used at the moment
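The two preprocess ideas mentioned above (discounting large counts and row normalization) can be sketched as follows. This is a minimal illustration using only the Matrix package; the toy matrix and the helper name normalize_rows are hypothetical, and the choice of the L2 norm for row scaling is an assumption.

```r
library(Matrix)

# Toy user-item count matrix: 2 users x 3 items (hypothetical data).
x = sparseMatrix(i = c(1, 1, 2), j = c(1, 3, 2), x = c(3, 4, 10), dims = c(2, 3))

# Option 1: discount large counts. log1p(0) is 0, so zeros stay zeros.
x_log = log1p(x)

# Option 2 (hypothetical helper): scale each row to unit L2 norm.
normalize_rows = function(m) {
  norms = sqrt(rowSums(m^2))
  norms[norms == 0] = 1  # leave all-zero rows untouched
  Diagonal(x = 1 / norms) %*% m
}
x_norm = normalize_rows(x)
```

Either function could then be supplied at construction time, for example preprocess = log1p or preprocess = normalize_rows.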

Method fit_transform()

fits the model

Usage

WRMF$fit_transform(x, n_iter = 10L, convergence_tol = 0.005, ...)

Arguments

x

input matrix (preferably in CSC format, `CsparseMatrix`)

n_iter

max number of ALS iterations

convergence_tol

convergence tolerance checked between iterations

...

not used at the moment
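A minimal sketch of a fit on synthetic data, assuming the rsparse and Matrix packages are installed. The matrix sizes, density, seed, and lambda value are arbitrary illustrative choices, not recommendations.

```r
library(Matrix)
library(rsparse)

set.seed(42)
# Synthetic implicit-feedback counts: 50 users x 20 items, about 20% non-zero.
x = rsparsematrix(50, 20, density = 0.2, rand.x = function(n) rpois(n, 2) + 1)

model = WRMF$new(rank = 4L, lambda = 0.1, feedback = "implicit")

# Run at most 10 ALS iterations; returns one embedding row per row of x.
user_emb = model$fit_transform(x, n_iter = 10L, convergence_tol = 0.005)

# Item embeddings are stored on the fitted model.
item_emb = model$components
```

A small positive lambda is used here because tiny random data can make the unregularized least-squares problems ill-conditioned.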

Method transform()

create user embeddings for new input

Usage

WRMF$transform(x, ...)

Arguments

x

user-item interaction matrix

...

not used at the moment
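As a sketch of how transform() might be used for users not seen during training, assuming rsparse and Matrix are installed. All data here is synthetic, and the new matrix must have the same item columns as the training matrix.

```r
library(Matrix)
library(rsparse)

set.seed(1)
n_items = 20
# Synthetic training interactions (40 users) and 5 previously unseen users.
train_x = rsparsematrix(40, n_items, density = 0.2, rand.x = function(n) rpois(n, 2) + 1)
new_x = rsparsematrix(5, n_items, density = 0.2, rand.x = function(n) rpois(n, 2) + 1)

model = WRMF$new(rank = 4L, lambda = 0.1, feedback = "implicit")
invisible(model$fit_transform(train_x, n_iter = 5L))

# Item embeddings stay fixed; only embeddings for the new rows are computed.
new_user_emb = model$transform(new_x)
```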

Method clone()

The objects of this class are cloneable with this method.

Usage

WRMF$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

References

Examples

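# NOT RUN {
library(rsparse)
data('movielens100k')
# keep the first 900 users for training, hold out the rest
train = movielens100k[1:900, ]
cv = movielens100k[901:nrow(movielens100k), ]
model = WRMF$new(rank = 5L, lambda = 0, feedback = 'implicit')
# a negative tolerance keeps the convergence check from stopping early
user_emb = model$fit_transform(train, n_iter = 5L, convergence_tol = -1)
item_emb = model$components
# top-10 items per held-out user, excluding items already seen in cv
preds = model$predict(cv, k = 10, not_recommend = cv)
# }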
