interval_score: Mean interval score (MIS) for prediction intervals

Description

Computes the mean interval score, a proper scoring rule that penalizes both the width of prediction intervals and any lack of coverage. Lower values indicate better interval quality.

Usage

interval_score(
  truth,
  lower_bound = NULL,
  upper_bound = NULL,
  intervals = NULL,
  return_vector = FALSE,
  alpha,
  na.rm = FALSE
)

Value

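A single numeric value representing the mean interval score across all observations or, if `return_vector = TRUE`, a numeric vector of interval scores with one element per observation.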

Arguments

truth: A numeric vector of true outcome values.
lower_bound: A numeric vector of lower bounds of the prediction intervals.
upper_bound: A numeric vector of upper bounds of the prediction intervals.
intervals: Alternative input for prediction intervals as a list-column, where each element is a list with components 'lower_bound' and 'upper_bound'. Useful for non-contiguous intervals, for instance those constructed using the bin conditional conformal method, which can yield multiple intervals per prediction. See Details.
return_vector: Logical, whether to return the interval score vector (TRUE) or the mean interval score (FALSE). Default is FALSE.
alpha: The nominal miscoverage rate (e.g., 0.1 for 90% prediction intervals).
na.rm: Logical, whether to remove NA values before calculation. Default is FALSE.

Details

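The interval score for a single observation is defined as: $$ IS = (ub - lb) + \frac{2}{\alpha}(lb - y) \cdot 1_{y < lb} + \frac{2}{\alpha}(y - ub) \cdot 1_{y > ub} $$ where $ y $ is the true value, $ [lb, ub] $ is the prediction interval, $ \alpha $ is the nominal miscoverage rate, and $ 1_{\cdot} $ is the indicator function (1 if the condition holds, 0 otherwise). The mean interval score (MIS) is the average of this quantity over all observations.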
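
The formula above can be checked by hand with a few lines of base R. This is only a minimal sketch for simple (contiguous) intervals: `interval_score_sketch` is a hypothetical helper written for illustration, not the package's implementation, and the toy values are made up.

```r
# Hypothetical helper (not the package's code): per-observation interval
# score for contiguous intervals, term by term as in the formula above.
interval_score_sketch <- function(truth, lower_bound, upper_bound, alpha) {
  width <- upper_bound - lower_bound
  # Penalty when the true value falls below the lower bound
  below <- (2 / alpha) * (lower_bound - truth) * (truth < lower_bound)
  # Penalty when the true value falls above the upper bound
  above <- (2 / alpha) * (truth - upper_bound) * (truth > upper_bound)
  width + below + above
}

# Toy data: one covered value, one below the interval, one above it
y  <- c(5, 1, 12)
lb <- c(2, 2, 2)
ub <- c(8, 8, 8)

scores <- interval_score_sketch(y, lb, ub, alpha = 0.1)
scores        # 6 26 86
mean(scores)  # about 39.33
```

The covered observation is charged only the interval width (6), while each miss adds 2 / alpha = 20 times its distance from the nearest bound.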

If the `intervals` argument is provided, it should be a list-column where each element is a list containing 'lower_bound' and 'upper_bound' vectors. This allows for the calculation of coverage for non-contiguous intervals, such as those produced by certain conformal prediction methods (for example, the bin conditional conformal method). In this case, coverage is determined by checking whether the true value falls within any of the specified intervals for each observation. If some observations have contiguous intervals and others have non-contiguous intervals, both the `lower_bound` and `upper_bound` vectors and the `intervals` list-column can be provided. The function will then compute coverage for each observation based on the available information.
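
As a rough illustration of the list-column layout and the "falls within any interval" check described above, the following base R sketch uses made-up data. The object names are hypothetical, and the function's internal handling of such intervals may differ from this.

```r
# Hypothetical list-column: one element per observation, each a list with
# 'lower_bound' and 'upper_bound' vectors (one entry per sub-interval).
intervals_sketch <- list(
  list(lower_bound = c(0, 5), upper_bound = c(2, 7)),  # two disjoint intervals
  list(lower_bound = 1, upper_bound = 4)               # a single interval
)
truth_sketch <- c(6, 5)

# An observation counts as covered if it lies in any of its sub-intervals
covered <- mapply(
  function(y, int) any(y >= int$lower_bound & y <= int$upper_bound),
  truth_sketch,
  intervals_sketch
)
covered  # TRUE FALSE
```

The first value (6) falls inside the second sub-interval [5, 7], so it is covered; the second value (5) lies outside [1, 4], so it is not.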

Examples

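library(dplyr)
library(tibble)
# Assumption: pintervals is the package that provides pinterval_parametric()
# and interval_score()
library(pintervals)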

# Simulate example data
set.seed(123)
x1 <- runif(1000)
x2 <- runif(1000)
y <- rnorm(1000, mean = x1 + x2, sd = 1)
df <- tibble(x1, x2, y)

# Split into training, calibration, and test sets
df_train <- df %>% slice(1:500)
df_cal <- df %>% slice(501:750)
df_test <- df %>% slice(751:1000)

# Fit a linear model on the training set
mod <- lm(y ~ x1 + x2, data = df_train)

# Generate predictions
pred_cal <- predict(mod, newdata = df_cal)
pred_test <- predict(mod, newdata = df_test)

# Estimate normal prediction intervals from calibration data
intervals <- pinterval_parametric(
  pred = pred_test,
  calib = pred_cal,
  calib_truth = df_cal$y,
  dist = "norm",
  alpha = 0.1
)

# Calculate the mean interval score
interval_score(
  truth = df_test$y,
  lower_bound = intervals$lower_bound,
  upper_bound = intervals$upper_bound,
  alpha = 0.1
)
