recommenderlab (version 1.0.6)

evaluate: Evaluate Recommender Models


Evaluates a single recommender model or a list of recommender models given an evaluation scheme and returns evaluation metrics.


evaluate(x, method, ...)

# S4 method for evaluationScheme,character
evaluate(x, method, type="topNList", n=1:10, parameter=NULL,
  progress = TRUE, keepModel=FALSE)

# S4 method for evaluationScheme,list
evaluate(x, method, type="topNList", n=1:10, parameter=NULL,
  progress = TRUE, keepModel=FALSE)


If a single recommender method is specified in method, then an object of class "evaluationResults" is returned. If method is a list of recommendation models, then an object of class "evaluationResultList" is returned.



x: an evaluation scheme (class "evaluationScheme").

method: a character string or a list. If a single character string is given, it defines the recommender method used for evaluation. If several recommender methods need to be compared, method contains a nested list. Each element describes a recommender method and consists of a list with two elements: a character string named "name" containing the method and a list named "parameters" containing the parameters used for this recommender method. See Recommender for available methods.

type: evaluate "topNList" or "ratings"?

n: a vector of the different values for N used to generate top-N lists (only if type="topNList").

parameter: a list with parameters for the recommender algorithm (only used when method is a single method).

progress: logical; report progress?

keepModel: logical; store used recommender models?

...: further arguments.


The evaluation uses the specification in the evaluation scheme to train recommender models on the training data and then evaluates the models on the test data. The result is a set of accuracy measures averaged over the test users. See calcPredictionAccuracy for details on the accuracy measures and the averaging. Note: The confusion matrix counts are also averaged over users and are therefore not whole numbers.

See vignette("recommenderlab") for more details on the evaluation process and the metrics used.
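To see why averaged confusion matrix counts are generally not whole numbers, here is a minimal base-R sketch. The per-user counts are made up for illustration, and the code only mimics the idea of averaging over test users; it is not recommenderlab's implementation.

```r
# Hypothetical per-user confusion matrix counts for one top-N list length.
# Each row is one test user; the numbers are invented for illustration.
per_user <- rbind(
  user1 = c(TP = 1, FP = 2, FN = 4, TN = 93),
  user2 = c(TP = 0, FP = 3, FN = 2, TN = 95),
  user3 = c(TP = 2, FP = 1, FN = 5, TN = 92)
)

# Averaging each column over the users gives fractional "counts".
avg_counts <- colMeans(per_user)
print(avg_counts)
```

Individual users always have integer counts, but their mean (here FN and TN) need not be an integer, which is the effect described in the note above.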

See Also

calcPredictionAccuracy, evaluationScheme, evaluationResults, evaluationResultList.


Run this code
library(recommenderlab)

### evaluate top-N list recommendations on a 0-1 data set
## Note: we sample only 100 users to make the example run faster
data("MSWeb")
MSWeb10 <- sample(MSWeb[rowCounts(MSWeb) > 10, ], 100)

## create an evaluation scheme (10-fold cross validation, given-3 scheme)
es <- evaluationScheme(MSWeb10, method="cross-validation",
        k=10, given=3)

## run evaluation
ev <- evaluate(es, "POPULAR", n=c(1,3,5,10))

## look at the results (the length of the topNList is shown as column n)
getResults(ev)

## get the confusion matrix averaged over the 10 folds
avg(ev)
plot(ev, annotate = TRUE)

## evaluate several algorithms (including a hybrid recommender) with a list
algorithms <- list(
  RANDOM = list(name = "RANDOM", param = NULL),
  POPULAR = list(name = "POPULAR", param = NULL),
  HYBRID = list(name = "HYBRID", param =
      list(recommenders = list(
          RANDOM = list(name = "RANDOM", param = NULL),
          POPULAR = list(name = "POPULAR", param = NULL)
        )
      )
  )
)

evlist <- evaluate(es, algorithms, n=c(1,3,5,10))

## select the first results by index
evlist[[1]]
avg(evlist[[1]])

plot(evlist, legend="topright")

### Evaluate using a data set with real-valued ratings
## Note: we use only the first 100 users to make the example run faster
data("Jester5k")
es <- evaluationScheme(Jester5k[1:100], method="split",
  train=.9, given=10, goodRating=5)
## Note: goodRating is used to determine positive ratings

## predict top-N recommendation lists
## (results in TPR/FPR and precision/recall)
ev <- evaluate(es, "RANDOM", type="topNList", n=10)

## predict missing ratings
## (results in RMSE, MSE and MAE)
ev <- evaluate(es, "RANDOM", type="ratings")
