llama (version 0.9.2)

misc: Convenience functions

Description

Convenience functions for computing and working with predictions.

Usage

vbs(data = NULL)
singleBest(data = NULL)
singleBestByCount(data = NULL)
singleBestByPar(data = NULL, factor = 10)
singleBestBySuccesses(data = NULL)
predTable(predictions = NULL, bestOnly = TRUE)

Arguments

data

the data to use, in the structure returned by input.

factor

the penalization factor to use for non-successful choices. Default 10.

predictions

the list of predictions.

bestOnly

whether to tabulate only the respective best algorithm for each instance. Default TRUE.

Value

A data frame with the predictions for each instance. The columns of the data frame are the instance ID columns (as determined by input), the algorithm, the score of the algorithm, and the iteration (always 1). The score is 1 if the respective algorithm is chosen for the instance, 0 otherwise. More than one prediction may be made for each instance and iteration.

For predTable, a table.

Details

vbs and singleBest take a data frame of input data and return predictions that correspond to the virtual best and the single best algorithm, respectively. The virtual best picks the best algorithm for each instance. If no algorithm solved the instance, NA is returned. The single best picks the algorithm that has the best cumulative performance over the entire data set.
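The difference between the two selections can be illustrated with a small base R sketch. It does not call llama and is not the package's implementation; the performance matrix, its names, and the assumption that lower values are better are all made up for illustration.

```r
# Toy performance values (assumed: lower is better).
# Rows are instances, columns are algorithms; all numbers are invented.
perf <- matrix(c(1, 5, 9,
                 4, 2, 8,
                 7, 6, 3,
                 2, 3, 9),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("inst", 1:4), c("A", "B", "C")))

# Virtual best: choose the best algorithm separately for each instance.
vbs_choice <- colnames(perf)[apply(perf, 1, which.min)]

# Single best: choose the one algorithm with the best cumulative performance.
single_choice <- names(which.min(colSums(perf)))

print(vbs_choice)
print(single_choice)
```

The virtual best changes from instance to instance, while the single best is one fixed algorithm for the whole data set.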

singleBestByCount returns the algorithm that has the best performance the highest number of times over the entire data set. Only whether or not an algorithm is the best on an instance matters for this, not the size of the difference to the other algorithms.
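A minimal base R sketch of why counting wins can give a different answer from summing performance. The data are invented, lower values are assumed to be better, and the code only mirrors the idea described above rather than llama's internals.

```r
# Toy performance values (assumed: lower is better); all numbers are invented.
perf <- matrix(c(1,   2,
                 1,   2,
                 100, 3),
               nrow = 3, byrow = TRUE,
               dimnames = list(NULL, c("A", "B")))

# Count how often each algorithm is the best on an instance.
winners <- colnames(perf)[apply(perf, 1, which.min)]
wins <- table(factor(winners, levels = colnames(perf)))
by_count <- names(which.max(wins))

# For comparison: the algorithm with the best cumulative performance.
by_sum <- names(which.min(colSums(perf)))

print(wins)
print(c(by_count = by_count, by_sum = by_sum))
```

Here A wins on two of three instances, so it is the best by count, even though its one very poor result makes B the better choice by cumulative performance.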

singleBestByPar aggregates the PAR score over the entire data set and returns the algorithm with the lowest overall PAR score. singleBestBySuccesses counts the number of successes over the data set and returns the algorithm with the highest overall number of successes.
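The following base R sketch shows how the penalization factor can change which algorithm has the lowest total PAR score. It assumes that an unsuccessful run is scored as the factor times a timeout; the timeout value, the runtimes, and the helper par_total are hypothetical and not part of llama.

```r
# Hypothetical cutoff; a run counts as unsuccessful if it reaches the timeout.
timeout <- 10

# Toy runtimes; all numbers are invented.
runtime <- matrix(c(9, 1,
                    9, 1,
                    9, 10),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(NULL, c("A", "B")))
solved <- runtime < timeout

# Total PAR score per algorithm: successful runs keep their runtime,
# unsuccessful runs are charged penalty * timeout (assumed definition).
par_total <- function(penalty) {
  colSums(ifelse(solved, runtime, penalty * timeout))
}

best_par10 <- names(which.min(par_total(10)))
best_par1 <- names(which.min(par_total(1)))

# Selection by number of successes ignores runtimes entirely.
best_successes <- names(which.max(colSums(solved)))

print(c(par10 = best_par10, par1 = best_par1, successes = best_successes))
```

With a penalty of 10, the single failure of B outweighs its fast runs and A is chosen; with a penalty of 1, B is chosen. Counting successes picks A, which solves every instance.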

predTable tabulates the predicted algorithms in the same way that table does. If bestOnly is FALSE, all predicted algorithms are tabulated. For regression models, for example, predictions are made for all algorithms, so the table will simply show the number of instances for each algorithm. Set bestOnly to TRUE to tabulate only the best algorithm for each instance.
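The effect of bestOnly can be sketched in base R on a hand-made predictions data frame with the columns described under Value. This is an illustration only: the data are invented, and picking the highest-scored row per instance is an assumption about what "best" means here, not a copy of predTable's code.

```r
# Invented predictions: every algorithm gets a score for every instance.
preds <- data.frame(
  id = rep(c("i1", "i2", "i3"), each = 2),
  algorithm = rep(c("A", "B"), times = 3),
  score = c(1, 0, 0, 1, 1, 0),
  iteration = 1
)

# Analogue of bestOnly = FALSE: tabulate every predicted algorithm.
all_tab <- table(preds$algorithm)

# Analogue of bestOnly = TRUE: keep the top-scored row per instance first.
best <- do.call(rbind, lapply(split(preds, preds$id), function(d) {
  d[which.max(d$score), ]
}))
best_tab <- table(best$algorithm)

print(all_tab)
print(best_tab)
```

The first table only counts instances per algorithm, while the second shows how often each algorithm was the top choice.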

Examples

# NOT RUN {
if(Sys.getenv("RUN_EXPENSIVE") == "true") {
data(satsolvers)

# number of total successes for virtual best solver
print(sum(successes(satsolvers, vbs)))
# number of total successes for single best solver by count
print(sum(successes(satsolvers, singleBestByCount)))

# sum of PAR10 scores for single best solver by PAR10 score
print(sum(parscores(satsolvers, singleBestByPar)))

# number of total successes for single best solver by successes
print(sum(successes(satsolvers, singleBestBySuccesses)))

# print a table of the best solvers per instance
print(predTable(vbs(satsolvers)))
}
# }
