This function compares two models based on the subset of forecasts for which
both models have made a prediction. It gets called from
pairwise_comparison_one_group, which handles the comparison of multiple models
on a single set of forecasts (i.e. there are no subsets of forecasts to be
distinguished). pairwise_comparison_one_group in turn gets called from
pairwise_comparison, which can handle pairwise comparisons for a set of
forecasts with multiple subsets, e.g. pairwise comparisons for one set of
forecasts, done separately for two different forecast targets.
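To show where compare_two_models sits in that call chain, here is a minimal
sketch of the typical entry point. forecast_data is a placeholder data set, and
the metric name "interval_score" as well as the exact arguments of
pairwise_comparison are assumptions that may differ between package versions:

library(scoringutils)

# forecast_data is a hypothetical data.frame in the input format expected by
# eval_forecasts; the resulting scores are unsummarised (one row per forecast)
scores <- eval_forecasts(forecast_data)

# high-level entry point: handles each subset (e.g. each forecast target)
# separately, calling pairwise_comparison_one_group per subset, which in turn
# calls compare_two_models for every pair of models
results <- pairwise_comparison(scores, metric = "interval_score")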
compare_two_models(scores, name_model1, name_model2, metric, test_options, by)
scores: A data.frame of unsummarised scores as produced by eval_forecasts.

name_model1: character, name of the first model.

name_model2: character, name of the model to compare against.

metric: A character vector of length one with the metric to do the comparison
on.

test_options: list with options to pass down to compare_two_models.
To change only one of the default options, just pass a list as input with
the name of the argument you want to change. All elements not included in the
list will be set to the default (so passing an empty list would result in the
default options). See the sketch after this argument list for an example.
by: character vector of columns to group scoring by. This should be the
lowest level of grouping possible, i.e. the unit of the individual
observation. This is important as many functions work on individual
observations. If you want a different level of aggregation, you should use
summarise_by to aggregate the individual scores.
Also note that the PIT will be computed using summarise_by instead of by.
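A minimal sketch of a direct call, using hypothetical model and column names;
the option name n_permutation inside test_options is illustrative only and not
taken from this page:

compare_two_models(
  scores = scores,                           # unsummarised scores from eval_forecasts
  name_model1 = "model_a",
  name_model2 = "model_b",
  metric = "interval_score",                 # assumed metric name
  test_options = list(n_permutation = 999),  # change one option, the rest keep their defaults
  by = c("target", "location", "forecast_date")  # illustrative unit of a single observation
)

Passing test_options = list() would leave all test options at their defaults,
as described above.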