idiolect (version 1.0.1)

performance: Performance evaluation

Description

This function tests the performance of an authorship analysis method.

Usage

performance(training, test = NULL)

Value

The function returns a list containing a data frame with performance statistics, as well as an object that can be used to draw a Tippett plot with the tippet.plot() function of the ROC package (https://github.com/davidavdav/ROC).
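
As a hedged sketch of working with the returned list (the name of the element that feeds tippet.plot() is an assumption here; inspect names(perf) for the actual one):

results <- data.frame(score = c(0.5, 0.2, 0.8, 0.01), target = c(TRUE, FALSE, TRUE, FALSE))
perf <- performance(results)
names(perf)   # shows which element holds the Tippett plot object
# The element name below is hypothetical; check names(perf) first.
# remotes::install_github("davidavdav/ROC")   # install the ROC package
# ROC::tippet.plot(perf$tippet)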

Arguments

training

The data frame with the results to evaluate, typically the output of an authorship analysis function such as impostors(). If only training is supplied, the function performs leave-one-out cross-validation.

test

Optional data frame of results. If present, a calibration model is extracted from training and its performance is evaluated on this data set (both calling patterns are sketched below).
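
A brief sketch of the two calling patterns (train_res and test_res stand for hypothetical data frames in the format returned by impostors()):

perf_loo <- performance(train_res)                                 # leave-one-out cross-validation
perf_split <- performance(training = train_res, test = test_res)   # calibrate on training, evaluate on test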

Details

Before applying a method to a real authorship case, it is good practice to test it on data with known ground truth. This function performs such a test: it takes as input either a single table of results or two tables, one for training and one for testing, and returns a list with the following performance statistics: the log-likelihood ratio cost (both \(C_{llr}\) and \(C_{llr}^{min}\)), the Equal Error Rate (EER), the mean log-likelihood ratios for the same-author (TRUE) and different-author (FALSE) cases, the Area Under the Curve (AUC), Balanced Accuracy, Precision, Recall, F1, and the full confusion matrix. The binary classification statistics are all calculated using a log-likelihood ratio score of 0 as the threshold.
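
As a minimal sketch of that thresholding rule in plain R (not the package's internal code), a score above 0 is classified as same-author:

llr   <- c(1.2, -0.4, 0.3, -2.1)          # log-likelihood ratio scores
truth <- c(TRUE, FALSE, TRUE, FALSE)      # ground truth: same author?
pred  <- llr > 0                          # LLR > 0 => same-author decision
table(Predicted = pred, Actual = truth)   # the full confusion matrix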

Examples

library(idiolect)
# Toy results: scores with ground-truth same-author (TRUE) labels
results <- data.frame(score = c(0.5, 0.2, 0.8, 0.01), target = c(TRUE, FALSE, TRUE, FALSE))
perf <- performance(results)  # leave-one-out evaluation
perf$evaluation               # data frame of performance statistics
