scores_sample_univ_weighted: Weighted Scoring Rules for Simulated Forecast Distributions

Description

Calculate weighted scores given observations and draws from univariate predictive distributions. The weighted scoring rules that are available are the threshold-weighted CRPS, outcome-weighted CRPS, and conditional and censored likelihood scores.

Usage

twcrps_sample(
  y,
  dat,
  a = -Inf,
  b = Inf,
  chain_func = NULL,
  w = NULL,
  show_messages = TRUE
)
owcrps_sample(
  y,
  dat,
  a = -Inf,
  b = Inf,
  weight_func = NULL,
  w = NULL,
  show_messages = TRUE
)
clogs_sample(
  y,
  dat,
  a = -Inf,
  b = Inf,
  bw = NULL,
  show_messages = FALSE,
  cens = TRUE
)

Value

Value of the score. A lower score indicates a better forecast.

Arguments

y: vector of realized values.
dat: vector or matrix (depending on y; see details) of simulation draws from forecast distribution.
a: numeric lower bound for the indicator weight function w(z) = 1{a < z < b}.
b: numeric upper bound for the indicator weight function w(z) = 1{a < z < b}.
chain_func: function used to target particular outcomes in the threshold-weighted CRPS; the default corresponds to the weight function w(z) = 1{a < z < b}.
w: optional; vector or matrix (matching dat) of ensemble weights. Note that these weights are not used in the weighted scoring rules; see details.
show_messages: logical; display of messages (does not affect warnings and errors).
weight_func: function used to target particular outcomes in the outcome-weighted CRPS; the default corresponds to the weight function w(z) = 1{a < z < b}.
bw: optional; vector (matching y) of bandwidths for kernel density estimation for clogs_sample; see details.
cens: logical; if TRUE, clogs_sample returns the censored likelihood score; if FALSE, clogs_sample returns the conditional likelihood score.

Author

Sam Allen

Details

For a vector y of length n, dat should be given as a matrix with n rows. If y has length 1, then dat may be a vector.

twcrps_sample transforms y and dat using the chaining function chain_func and then calls crps_sample. owcrps_sample weights y and dat using the weight function weight_func and then calls crps_sample. See the documentation for crps_sample for further details.

The default weight function used in the weighted scores is w(z) = 1{a < z < b}, which is equal to one if z is between a and b, and zero otherwise. This weight function emphasises outcomes between a and b, and is commonly used in practical applications when interest is on values above a threshold (set b = Inf and a equal to the threshold) or below a threshold (set a = -Inf and b equal to the threshold).

Alternative weight functions can also be employed using the chain_func and weight_func arguments to twcrps_sample and owcrps_sample, respectively. Computation of the threshold-weighted CRPS for samples from a predictive distribution requires a chaining function rather than a weight function. This is why a chaining function is an input for twcrps_sample whereas a weight function is an input for owcrps_sample. Since clogs_sample requires kernel density estimation to approximate the forecast density, it cannot readily be calculated for arbitrary weight functions, and is thus only available for the canonical weight function w(z) = 1{a < z < b}.

The chain_func and weight_func arguments are functions that will be applied to the vector y and the columns of dat. It is assumed that these functions are vectorised. Both functions must take a vector as an input and output a vector of the same length, containing the weight (for weight_func) or transformed value (for chain_func) corresponding to each element in the input vector. An error will be returned if weight_func returns negative values, and a warning message will appear if chain_func is not increasing.

If no custom argument is given for a, b, chain_func or weight_func, then both twcrps_sample and owcrps_sample are equivalent to the standard unweighted crps_sample, and clogs_sample is equivalent to logs_sample.

The w argument is also present in the unweighted scores (e.g. crps_sample). w is used to weight the draws from the predictive distribution, and does not weight particular outcomes within the weighted scoring rules. This should not be confused with the weight_func argument, which is used within the weighted scores.

References

Allen, S. (2024): `Weighted scoringRules: Emphasising Particular Outcomes when Evaluating Probabilistic Forecasts', Journal of Statistical Software. tools:::Rd_expr_doi("10.18637/jss.v110.i08")

Threshold-weighted CRPS:

Gneiting, T. and R. Ranjan (2011): `Comparing density forecasts using threshold-and quantile-weighted scoring rules', Journal of Business & Economic Statistics 29, 411-422. tools:::Rd_expr_doi("10.1198/jbes.2010.08110")

Allen, S., Ginsbourger, D. and J. Ziegel (2023): `Evaluating forecasts for high-impact events using transformed kernel scores', SIAM/ASA Journal on Uncertainty Quantification 11, 906-940. tools:::Rd_expr_doi("10.1137/22M1532184")

Outcome-weighted CRPS:

Holzmann, H. and B. Klar (2017): `Focusing on regions of interest in forecast evaluation', Annals of Applied Statistics 11, 2404-2431. tools:::Rd_expr_doi("10.1214/17-AOAS1088")

Conditional and censored likelihood scores:

Diks, C., Panchenko, V. and D. Van Dijk (2011): `Likelihood-based scoring rules for comparing density forecasts in tails', Journal of Econometrics 163, 215-230. tools:::Rd_expr_doi("10.1016/j.jeconom.2011.04.001")

Examples

Run this code

if (FALSE) {

y <- rnorm(10)
sample_fc <- matrix(rnorm(100), nrow = 10)

crps_sample(y = y, dat = sample_fc)
twcrps_sample(y = y, dat = sample_fc)
owcrps_sample(y = y, dat = sample_fc)

logs_sample(y = y, dat = sample_fc)
clogs_sample(y = y, dat = sample_fc)
clogs_sample(y = y, dat = sample_fc, cens = FALSE)

# emphasise outcomes above 0
twcrps_sample(y = y, dat = sample_fc, a = 0)
owcrps_sample(y = y, dat = sample_fc, a = 0)
clogs_sample(y = y, dat = sample_fc, a = 0)
clogs_sample(y = y, dat = sample_fc, a = 0, cens = FALSE)

# emphasise outcomes below 0
twcrps_sample(y = y, dat = sample_fc, b = 0)
owcrps_sample(y = y, dat = sample_fc, b = 0)
clogs_sample(y = y, dat = sample_fc, b = 0) 

# emphasise outcomes between -1 and 1
twcrps_sample(y = y, dat = sample_fc, a = -1, b = 1)
owcrps_sample(y = y, dat = sample_fc, a = -1, b = 1)
clogs_sample(y = y, dat = sample_fc, a = -1, b = 1)


# a must be smaller than b 
twcrps_sample(y = y, dat = sample_fc, a = 1, b = -1) # error
owcrps_sample(y = y, dat = sample_fc, a = 0, b = 0) # error
clogs_sample(y = y, dat = sample_fc, a = 10, b = 9) # error

# a and b must be single numeric values (not vectors)
twcrps_sample(y = y, dat = sample_fc, a = rnorm(10)) # error


# the owCRPS is not well-defined if none of dat are between a and b
y <- rnorm(10)
sample_fc <- matrix(runif(100, -5, 1), nrow = 10)
owcrps_sample(y = y, dat = sample_fc, a = 1)
# the twCRPS is zero if none of y and dat are between a and b
twcrps_sample(y = y, dat = sample_fc, a = 1) 


# alternative custom weight and chaining functions can also be used

# Example 1: a Gaussian weight function with location mu and scale sigma
mu <- 0
sigma <- 0.5
weight_func <- function(x) pnorm(x, mu, sigma)
# or weight_func <- get_weight_func("norm_cdf", mu, sigma)
# a corresponding chaining function is
chain_func <- function(x) (x - mu)*pnorm(x, mu, sigma) + (sigma^2)*dnorm(x, mu, sigma)
# or chain_func <- get_weight_func("norm_cdf", mu, sigma, weight = FALSE)

x <- seq(-2, 2, 0.01)
plot(x, weight_func(x), type = "l") # positive outcomes are given higher weight
plot(x, chain_func(x), type = "l") 

owcrps_sample(y = y, dat = sample_fc, a = mu)
owcrps_sample(y = y, dat = sample_fc, weight_func = weight_func)
twcrps_sample(y = y, dat = sample_fc, a = mu)
twcrps_sample(y = y, dat = sample_fc, chain_func = chain_func)


# Example 2: a sigmoid (or logistic) weight function with location mu and scale sigma
weight_func <- function(x) plogis(x, mu, sigma)
# or weight_func <- get_weight_func("logis_cdf", mu, sigma)
chain_func <- function(x) sigma*log(exp((x - mu)/sigma) + 1)
# or chain_func <- get_weight_func("logis_cdf", mu, sigma, weight = FALSE)

x <- seq(-2, 2, 0.01)
plot(x, weight_func(x), type = "l") # positive outcomes are given higher weight
plot(x, chain_func(x), type = "l") 

owcrps_sample(y = y, dat = sample_fc, a = mu)
owcrps_sample(y = y, dat = sample_fc, weight_func = weight_func)
twcrps_sample(y = y, dat = sample_fc, a = mu)
twcrps_sample(y = y, dat = sample_fc, chain_func = chain_func)


# Example 3: the weight function w(z) = 1{z < a or z > b}
a <- -1
b <- 1
weight_func <- function(x) as.numeric(x < a | x > b)
chain_func <- function(x) (x < a)*(x - a) + (x > b)*(x - b) + a

x <- seq(-2, 2, 0.01)
plot(x, weight_func(x), type = "l")
plot(x, chain_func(x), type = "l")

owcrps_sample(y = y, dat = sample_fc, weight_func = weight_func)
twcrps_sample(y = y, dat = sample_fc, chain_func = chain_func)
twcrps_sample(y = y, dat = sample_fc, b = -1) + twcrps_sample(y = y, dat = sample_fc, a = 1)
crps_sample(y = y, dat = sample_fc) - twcrps_sample(y = y, dat = sample_fc, a = -1, b = 1)
}

Run the code above in your browser using DataLab