pit_df: Probability Integral Transformation (data.frame Format)

Description

Wrapper around `pit()` for use with data.frames.

Usage

pit_df(
  data,
  plot = TRUE,
  full_output = FALSE,
  n_replicates = 100,
  num_bins = NULL,
  verbose = FALSE
)

Arguments

data

a data.frame with the following columns: `true_value`, `prediction`, `sample`

plot

logical. If TRUE, a histogram of the PIT values will be returned as well

full_output

logical. If TRUE, return all individual p-values and computed u_t values for the randomised PIT. Usually not needed.

n_replicates

the number of tests to perform, each time re-randomising the PIT

num_bins

the number of bins in the PIT histogram (only used if plot == TRUE). If not given, the square root of n (the number of observations) will be used.

verbose

if TRUE (default is FALSE), more error messages are printed. This is usually not needed, but may help with debugging.
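
As a rough illustration of the `num_bins` default described above, the following base R sketch derives a bin count from the square root of the number of observations and bins some PIT values accordingly. The values in `u` are made up, and the rounding step is an assumption; the text only states that the square root of n is used.

```r
# Hypothetical PIT values; in practice these are computed from the forecasts.
set.seed(1)
u <- runif(50)
n <- length(u)

num_bins <- NULL
if (is.null(num_bins)) {
  # Assumption: the square root is rounded to the nearest integer.
  num_bins <- round(sqrt(n))
}

# Equal-width bins on [0, 1], then count the PIT values falling in each bin.
breaks <- seq(0, 1, length.out = num_bins + 1)
counts <- table(cut(u, breaks = breaks, include.lowest = TRUE))
```

For a well-calibrated forecast, the counts should be roughly equal across bins.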

Value

a list with the following components:

data: the input data.frame (not including rows where prediction is `NA`), with added columns `pit_p_val` and `pit_sd`
hist_PIT: a plot object with the PIT histogram. Only returned if plot == TRUE. Call plot(pit_df(...)$hist_PIT) to display the histogram.
p_values: all p-values generated from the Anderson-Darling tests on the (randomised) PIT. Only returned if full_output == TRUE
u: the u_t values computed internally. Only returned if full_output == TRUE

Details

See `pit()`.
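
The idea of re-randomising the PIT `n_replicates` times can be sketched in base R as follows. This is not the package's implementation: the data are simulated, the randomisation formula is the standard one for integer-valued forecasts, and a Kolmogorov-Smirnov test (`stats::ks.test`) stands in for the Anderson-Darling test so that the sketch needs no extra packages. Summarising the p-values by their mean and standard deviation is likewise an assumption about what `pit_p_val` and `pit_sd` represent.

```r
set.seed(42)
n_obs <- 30          # number of observations (hypothetical)
n_samples <- 200     # predictive samples per observation (hypothetical)
n_replicates <- 100  # number of re-randomisations

# Simulated integer observations and predictive samples
# (rows = observations, columns = samples).
true_values <- rpois(n_obs, lambda = 5)
samples <- matrix(rpois(n_obs * n_samples, lambda = 5), nrow = n_obs)

# Empirical predictive CDF evaluated at x and at x - 1.
P_x   <- rowMeans(samples <= true_values)
P_xm1 <- rowMeans(samples <= true_values - 1)

# Each replicate draws new uniform noise, forms the randomised PIT values
# u_t, and tests them for uniformity.
p_values <- replicate(n_replicates, {
  v <- runif(n_obs)
  u <- P_xm1 + v * (P_x - P_xm1)
  # KS test used here in place of Anderson-Darling (assumption).
  suppressWarnings(ks.test(u, "punif")$p.value)
})

# Summaries across replicates (assumed analogues of pit_p_val and pit_sd).
p_mean <- mean(p_values)
p_sd   <- sd(p_values)
```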

References

Sebastian Funk, Anton Camacho, Adam J. Kucharski, Rachel Lowe, Rosalind M. Eggo, W. John Edmunds (2019) Assessing the performance of real-time epidemic forecasts: A case study of Ebola in the Western Area region of Sierra Leone, 2014-15, <doi:10.1371/journal.pcbi.1006785>

Examples

# NOT RUN {
example <- scoringutils::continuous_example_data
result <- pit_df(example, full_output = TRUE)

# }