logloss: Log Loss

Description

Measure to compare true observed labels with predicted probabilities in multiclass classification tasks.

Usage

logloss(truth, prob, sample_weights = NULL, eps = 1e-15, ...)

Value

Performance value as numeric(1).

Arguments

truth: (factor())
True (observed) labels. Its levels must match the column names of prob, and its length must equal the number of rows of prob.
prob: (matrix())
Matrix of predicted probabilities with one row per observation. Each column holds the probabilities for one class label. Columns must be named with the levels of truth.
sample_weights: (numeric())
Vector of non-negative and finite sample weights. Must have the same length as truth. The vector gets automatically normalized to sum to one. Defaults to equal sample weights.
eps: (numeric(1))
Probabilities are clipped to max(eps, min(1 - eps, p)). Otherwise the measure would be undefined for probabilities p = 0 and p = 1.
...: (any)
Additional arguments. Currently ignored.
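
The clipping rule described for eps can be illustrated with a minimal base-R sketch. The helper name clip_prob is hypothetical and is not part of the package; it only mirrors the expression max(eps, min(1 - eps, p)) applied element-wise.

```r
# Hypothetical helper illustrating the clipping rule; not part of the package.
clip_prob = function(p, eps = 1e-15) {
  # Element-wise version of max(eps, min(1 - eps, p)).
  pmax(eps, pmin(1 - eps, p))
}

p = c(0, 0.25, 1)
clipped = clip_prob(p)
# Every entry is now finite, whereas -log(0) would be Inf.
-log(clipped)
```

Probabilities strictly inside (eps, 1 - eps) pass through unchanged; only the boundary values are moved.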

Meta Information

Type: "classif"
Range: $[0, \infty)$
Minimize: TRUE
Required prediction: prob

Details

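The Log Loss is defined as $$ -\frac{1}{n} \sum_{i=1}^n w_i \log \left( p_i \right ) $$ where $n$ is the number of observations, $w_i$ is the sample weight of observation $i$, and $p_i$ is the predicted probability (after clipping with eps) for the true class of observation $i$.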
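
As a rough illustration of the definition, the following base-R sketch computes the measure by hand. The function name logloss_sketch is hypothetical and this is not the package's implementation. It assumes the weights are normalized to sum to one, as the sample_weights argument describes, so that equal weights reduce the result to the plain mean of the negative log probabilities.

```r
# Hypothetical reference sketch for illustration; not the package's code.
logloss_sketch = function(truth, prob, sample_weights = NULL, eps = 1e-15) {
  n = length(truth)
  # For each row i, pick the probability in the column named after truth[i].
  p = prob[cbind(seq_len(n), match(as.character(truth), colnames(prob)))]
  # Clip to [eps, 1 - eps] so that log() stays finite.
  p = pmax(eps, pmin(1 - eps, p))
  # Assumption: weights are normalized to sum to one; equal weights by default.
  w = if (is.null(sample_weights)) rep(1 / n, n) else sample_weights / sum(sample_weights)
  -sum(w * log(p))
}

lvls = c("a", "b")
truth = factor(c("a", "b", "a"), levels = lvls)
prob = matrix(c(0.9, 0.2, 0.6,
                0.1, 0.8, 0.4), ncol = 2, dimnames = list(NULL, lvls))
# The true-class probabilities are 0.9, 0.8 and 0.6.
logloss_sketch(truth, prob)
```

With the default equal weights, the result here equals -mean(log(c(0.9, 0.8, 0.6))).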

Examples

set.seed(1)
lvls = c("a", "b", "c")
truth = factor(sample(lvls, 10, replace = TRUE), levels = lvls)
prob = matrix(runif(3 * 10), ncol = 3, dimnames = list(NULL, lvls))
prob = t(apply(prob, 1, function(x) x / sum(x)))
logloss(truth, prob)