
Evaluates a single recommender model or a list of recommender models given an evaluation scheme and returns evaluation metrics.
evaluate(x, method, ...)

# S4 method for evaluationScheme,character
evaluate(x, method, type="topNList",
  n=1:10, parameter=NULL, progress = TRUE, keepModel=FALSE)

# S4 method for evaluationScheme,list
evaluate(x, method, type="topNList",
  n=1:10, parameter=NULL, progress = TRUE, keepModel=FALSE)
If a single recommender method is specified in method, then an object of class "evaluationResults" is returned. If method is a list of recommender methods, then an object of class "evaluationResultList" is returned.
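For orientation, a minimal sketch of both cases (assuming an evaluation scheme es like the one created in the examples below; the accessors avg() and getResults() also appear in the examples):

## single method: returns an evaluationResults object
ev <- evaluate(es, "POPULAR", n = c(1, 3, 5, 10))
class(ev)              # "evaluationResults"
avg(ev)                # metrics averaged over the evaluation runs

## list of methods: returns an evaluationResultList object
evlist <- evaluate(es, list(
  RANDOM  = list(name = "RANDOM", param = NULL),
  POPULAR = list(name = "POPULAR", param = NULL)), n = c(1, 3, 5, 10))
class(evlist)          # "evaluationResultList"
names(evlist)
avg(evlist[[2]])       # results for the second method ("POPULAR")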
x: an evaluation scheme (class "evaluationScheme").

method: a character string or a list. If a single character string is given, it defines the recommender method used for evaluation. If several recommender methods need to be compared, method contains a nested list. Each element describes a recommender method and consists of a list with two elements: a character string named "name" containing the method and a list named "parameters" containing the parameters used for this recommender method (a sketch of this nested-list form follows the argument list). See Recommender for available methods.

type: evaluate "topNList" or "ratings"?

n: a vector of the different values for N used to generate top-N lists (only used if type="topNList").

parameter: a list with parameters for the recommender algorithm (only used when method is a single method).

progress: logical; report progress?

keepModel: logical; store the used recommender models?

...: further arguments.
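As a sketch, the nested-list form of method could look as follows (the method names "UBCF"/"IBCF" and the parameter names nn and k are illustrative assumptions; see Recommender for the methods and parameters available in your installed recommenderlab version):

## one element per recommender; "name" gives the method and
## "param" holds its settings (NULL means use the defaults)
methods <- list(
  UBCF    = list(name = "UBCF", param = list(nn = 50)),   # nn is an assumed parameter name
  IBCF    = list(name = "IBCF", param = list(k = 30)),    # k is an assumed parameter name
  POPULAR = list(name = "POPULAR", param = NULL)
)
## evlist <- evaluate(es, methods, n = c(1, 3, 5, 10))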
The evaluation uses the specification in the evaluation scheme to train recommender models on the training data and then evaluates the models on the test data.
The result is a set of accuracy measures averaged over the test users.
See calcPredictionAccuracy for details on the accuracy measures and the averaging.
Note: The confusion matrix counts are also averaged over users and are therefore typically not whole numbers.
See vignette("recommenderlab") for more details on the evaluation process and the metrics used.
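For example, the averaging can be inspected directly (a sketch assuming the 10-fold MSWeb evaluation ev from the examples below; the column names "TP" and "FP" in the averaged matrix are an assumption, check colnames(avg(ev))):

getResults(ev)                            # counts per cross-validation fold
cm <- avg(ev)                             # counts averaged over the folds
cm                                        # TP, FP, FN, TN are typically fractional
cm[, "TP"] / (cm[, "TP"] + cm[, "FP"])    # precision recomputed by hand from the averages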
See also: calcPredictionAccuracy, evaluationScheme, evaluationResults, evaluationResultList.
### evaluate top-N list recommendations on a 0-1 data set
## Note: we sample only 100 users to make the example run faster
data("MSWeb")
MSWeb10 <- sample(MSWeb[rowCounts(MSWeb) >10,], 100)
## create an evaluation scheme (10-fold cross validation, given-3 scheme)
es <- evaluationScheme(MSWeb10, method="cross-validation",
k=10, given=3)
## run evaluation
ev <- evaluate(es, "POPULAR", n=c(1,3,5,10))
ev
## look at the results (the length of the topNList is shown as column n)
getResults(ev)
## get the confusion matrices averaged over the 10 folds
avg(ev)
plot(ev, annotate = TRUE)
## evaluate several algorithms (including a hybrid recommender) with a list
algorithms <- list(
  RANDOM = list(name = "RANDOM", param = NULL),
  POPULAR = list(name = "POPULAR", param = NULL),
  HYBRID = list(name = "HYBRID", param =
    list(recommenders = list(
      RANDOM = list(name = "RANDOM", param = NULL),
      POPULAR = list(name = "POPULAR", param = NULL)
    ))
  )
)
evlist <- evaluate(es, algorithms, n=c(1,3,5,10))
evlist
names(evlist)
## select the first results by index
evlist[[1]]
avg(evlist[[1]])
plot(evlist, legend="topright")
### Evaluate using a data set with real-valued ratings
## Note: we use only the first 100 users to make the example run faster
data("Jester5k")
es <- evaluationScheme(Jester5k[1:100], method="split",
train=.9, given=10, goodRating=5)
## Note: goodRating is used to determine positive ratings
## predict top-N recommendation lists
## (results in TPR/FPR and precision/recall)
ev <- evaluate(es, "RANDOM", type="topNList", n=10)
getResults(ev)
## predict missing ratings
## (results in RMSE, MSE and MAE)
ev <- evaluate(es, "RANDOM", type="ratings")
getResults(ev)