Evaluate forecasts in a Binary Format
Usage

eval_forecasts_binary(
  data,
  by,
  summarise_by,
  metrics,
  quantiles,
  sd,
  summarised,
  verbose
)
Arguments

data

A data.frame or data.table with the predictions and observations. Note: it is easiest to have a look at the example files provided in the package and in the examples below. The following columns need to be present:

true_value - the true observed values

prediction - predictions or predictive samples for one true value. (A prediction column is only not required if you want to score quantile forecasts in a wide range format.)
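To make the binary case concrete, a minimal sketch of the input might look like the following (the column values, the `model` and `id` grouping columns, and all numbers are hypothetical; only `true_value`, `prediction`, and your grouping columns are needed):

```r
# Minimal sketch of a binary forecast data set (hypothetical values):
# true_value is 0/1, prediction is the forecast probability of observing a 1.
binary_data <- data.frame(
  model      = c("model_a", "model_a", "model_b", "model_b"),
  id         = c(1, 2, 1, 2),                # one row per observation
  true_value = c(1, 0, 1, 0),                # observed binary outcomes
  prediction = c(0.8, 0.3, 0.6, 0.4)         # predicted probabilities
)
```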
For integer and continuous forecasts a sample column is needed:

sample - an index to identify the predictive samples in the prediction column generated by one model for one true value. Only necessary for continuous and integer forecasts, not for binary predictions.
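A sketch of the sample-based layout (hypothetical values; here two predictive samples per observation, so each true value is repeated once per sample):

```r
# Sketch of a sample-based (continuous/integer) forecast data set:
# each row holds one predictive sample for one observation.
sample_data <- data.frame(
  id         = c(1, 1, 2, 2),
  true_value = c(2.3, 2.3, 1.1, 1.1),  # repeated for every sample
  sample     = c(1, 2, 1, 2),          # index of the predictive sample
  prediction = c(2.1, 2.6, 0.9, 1.4)   # one predictive draw per row
)
```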
For quantile forecasts the data can be provided in a variety of formats. You can either use a range-based format or a quantile-based format. (You can convert between formats using quantile_to_range_long, range_long_to_quantile, sample_to_range_long, sample_to_quantile.)
For a quantile-based format you should provide:

prediction - the prediction corresponding to a quantile
quantile - the quantile to which the prediction corresponds
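In the quantile-based format, each row holds one quantile of the predictive distribution for one observation. A sketch with hypothetical values:

```r
# Sketch of the quantile-based format: one row per quantile level.
quantile_data <- data.frame(
  id         = c(1, 1, 1),
  true_value = c(2.3, 2.3, 2.3),     # repeated for each quantile row
  quantile   = c(0.25, 0.5, 0.75),   # quantile level
  prediction = c(1.8, 2.2, 2.7)      # forecast for that quantile
)
```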
For a range-based format (long) you need:

prediction - the quantile forecasts
boundary - either "lower" or "upper", depending on whether the prediction is for the lower or upper bound of a given range
range - the range for which a forecast was made. For a 50% interval the range should be 50. The forecast for the 25% quantile should have its value in the prediction column, the value of range should be 50 and the value of boundary should be "lower".

If you want to score the median (i.e. range = 0), you still need to include a lower and an upper estimate, so the median has to appear twice.
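The long range-based layout for a single observation might then look as follows (hypothetical values): the 25% and 75% quantiles form the range = 50 rows, and the median appears twice with range = 0, as described above.

```r
# Sketch of the long range-based format: a 50% interval (quantiles 0.25
# and 0.75) plus the median, which must appear twice with range = 0.
range_data <- data.frame(
  id         = c(1, 1, 1, 1),
  true_value = c(2.3, 2.3, 2.3, 2.3),
  range      = c(50, 50, 0, 0),
  boundary   = c("lower", "upper", "lower", "upper"),
  prediction = c(1.8, 2.7, 2.2, 2.2)  # 25%, 75%, median, median
)
```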
Alternatively, you can provide the data in a wide range format. This format needs pairs of columns like 'upper_90' and 'lower_90', or 'upper_50' and 'lower_50', where the number denotes the interval range. For the median, you need to provide columns called 'upper_0' and 'lower_0'.
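A sketch of the wide range format (hypothetical values): one row per observation, with interval bounds as paired columns and the median duplicated in 'lower_0'/'upper_0'. Note that no prediction column is needed here.

```r
# Sketch of the wide range format: interval bounds as paired columns,
# with 'lower_0' and 'upper_0' both holding the median.
wide_data <- data.frame(
  id         = c(1, 2),
  true_value = c(2.3, 1.1),
  lower_50   = c(1.8, 0.7),
  upper_50   = c(2.7, 1.5),
  lower_0    = c(2.2, 1.0),   # median
  upper_0    = c(2.2, 1.0)    # median repeated
)
```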
by

character vector of columns to group scoring by. This should be the lowest level of grouping possible, i.e. the unit of the individual observation. This is important as many functions work on individual observations. If you want a different level of aggregation, you should use summarise_by to aggregate the individual scores. Also note that the pit will be computed using summarise_by instead of by.
summarise_by

character vector of columns to group the summary by. By default, this is equal to `by` and no summary takes place. But sometimes you may want to summarise over categories different from the scoring. summarise_by is also the grouping level used to compute (and possibly plot) the probability integral transform (pit).
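To illustrate the by / summarise_by distinction, the summary step conceptually averages per-observation scores over the summarise_by groups. The sketch below uses base R aggregate() as a stand-in for that internal step; the score values and column names are hypothetical:

```r
# Hypothetical per-observation scores, as produced with by = c("model", "id").
# Summarising by "model" then averages the scores over id within each model.
scores <- data.frame(
  model       = c("model_a", "model_a", "model_b", "model_b"),
  id          = c(1, 2, 1, 2),
  brier_score = c(0.04, 0.09, 0.16, 0.16)
)
summary_scores <- aggregate(brier_score ~ model, data = scores, FUN = mean)
```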
metrics

the metrics you want to have in the output. If `NULL` (the default), all available metrics will be computed.
quantiles

numeric vector of quantiles to be returned when summarising. Instead of just returning a mean, quantiles will be returned for the groups specified through `summarise_by`. By default, no quantiles are returned.
sd

if TRUE (the default is FALSE), the standard deviation of all metrics will be returned when summarising.
summarised

Summarise the scores (i.e. take the mean per group specified in summarise_by). Default is TRUE.
verbose

print out additional helpful messages (default is TRUE)
Value

A data.table with appropriate scores. For more information see eval_forecasts.
Examples

# NOT RUN {
# Probability Forecast for Binary Target
binary_example <- data.table::setDT(scoringutils::binary_example_data)
eval <- scoringutils::eval_forecasts(data = binary_example,
                                     summarise_by = c("model"),
                                     quantiles = c(0.5), sd = TRUE,
                                     verbose = FALSE)
# }