parametric: Scoring Rules for Parametric Forecast Distributions

Description

Compute scores of the form \(S(y, F)\), where \(S\) is a proper scoring rule, \(y\) is a vector of realizations, and \(F\) belongs to a parametric family of distributions.

Usage

crps(y, family, ...)
logs(y, family, ...)

Arguments

y

Vector of realized values

family

String which specifies the parametric family; currently implemented: "beta", "exponential", "gamma", "gev", "gpd", "laplace", "log-laplace", "log-logistic", "log-normal", "logistic", "mixture-normal", "negative-binomial", "normal", "poisson", "t", "two-piece-exponential", "two-piece-normal", "uniform".

...

Vectors of parameter values; expected input depends on chosen family. See details below.

Value

Vector of values of the score. A lower score indicates a better forecast.
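
To illustrate what the returned values represent, here is a minimal base-R sketch for the normal family. It uses the closed-form CRPS of a normal forecast (see Gneiting et al., 2005, in the references) and the negative log density as the logarithmic score. The function names crps_normal, logs_normal and crps_numeric are hypothetical helpers written for this illustration; they are not part of the package and are not claimed to match its internal implementation.

```r
# Closed-form CRPS of a normal forecast N(mean, sd^2); lower is better.
crps_normal <- function(y, mean, sd) {
  z <- (y - mean) / sd
  sd * (z * (2 * pnorm(z) - 1) + 2 * dnorm(z) - 1 / sqrt(pi))
}

# Logarithmic score: negative log of the forecast density evaluated at y.
logs_normal <- function(y, mean, sd) {
  -dnorm(y, mean = mean, sd = sd, log = TRUE)
}

# Numerical cross-check using CRPS = integral of (F(x) - 1{y <= x})^2 dx,
# split at y so that each integrand is smooth.
crps_numeric <- function(y, mean, sd) {
  below <- integrate(function(x) pnorm(x, mean, sd)^2, -Inf, y)$value
  above <- integrate(function(x) (1 - pnorm(x, mean, sd))^2, y, Inf)$value
  below + above
}

crps_normal(1, mean = 0, sd = 2)
crps_numeric(1, mean = 0, sd = 2)
logs_normal(1, mean = 0, sd = 2)
```

The closed form and the numerical integral should agree up to integration error, which gives a simple sanity check when adding a formula for a new family.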

Details

The parameters supplied to each of the functions are numeric vectors:

Distributions defined on the real line:
- "laplace" or "lapl": location (real-valued location parameter), scale (positive scale parameter); see flapl
- "logistic" or "logis": location (real-valued location parameter), scale (positive scale parameter); see Logistic
- "normal" or "norm": mean, sd (mean and standard deviation); see Normal
- "t": location (real-valued location parameter), scale (positive scale parameter), df (degrees of freedom); see ft
- "normal-mixture" or "mixnorm": m (mean parameters), s (standard deviations), w (weights); see fmixnorm; note: matrix-input for parameters
- "two-piece-exponential" or "2pexp": location (real-valued location parameter), scale1, scale2 (positive scale parameters); see f2pexp
- "two-piece-normal" or "2pnorm": location (real-valued location parameter), scale1, scale2 (positive scale parameters); see f2pnorm
Distributions for non-negative random variables:
- "exponential" or "exp": rate (positive rate parameter); see Exponential
- "gamma": shape (positive shape parameter), rate (positive rate parameter), scale (alternative to rate); see GammaDist
- "log-laplace" or "llapl": locationlog (real-valued location parameter), scalelog (positive scale parameter); see fllapl
- "log-logistic" or "llogis": locationlog (real-valued location parameter), scalelog (positive scale parameter); see fllogis
- "log-normal" or "lnorm": locationlog (real-valued location parameter), scalelog (positive scale parameter); see Lognormal
Distributions for random variables with variable support:
- "normal" or "norm": location (location parameter), scale (scale parameter), lower (real-valued truncation parameter, lower bound), upper (real-valued truncation parameter, upper bound), lmass (point mass in lower bound, string "cens" or "trunc"), umass (point mass in upper bound, string "cens" or "trunc"); see fnorm
- "t": location (location parameter), scale (scale parameter), df (degrees of freedom), lower (real-valued truncation parameter, lower bound), upper (real-valued truncation parameter, upper bound), lmass (point mass in lower bound, string "cens" or "trunc"), umass (point mass in upper bound, string "cens" or "trunc"); see ft
- "logistic" or "logis": location (location parameter), scale (scale parameter), lower (real-valued truncation parameter, lower bound), upper (real-valued truncation parameter, upper bound), lmass (point mass in lower bound, string "cens" or "trunc"), umass (point mass in upper bound, string "cens" or "trunc"); see flogis
- "exponential" or "exp": location (real-valued location parameter), scale (positive scale parameter), mass (point mass in location); see fexp
- "gpd": location (real-valued location parameter), scale (positive scale parameter), shape (real-valued shape parameter), mass (point mass in location); see fgpd
- "gev": location (real-valued location parameter), scale (positive scale parameter), shape (real-valued shape parameter); see fgev
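
The difference between censoring and truncation at a lower bound can be sketched in base R as follows. This assumes the usual meaning of the two terms: censoring moves the mass of the underlying normal below the bound onto the bound as a point mass, while truncation discards that mass and renormalizes. The helpers pnorm_cens and pnorm_trunc are hypothetical names used only for this illustration; how the package maps "cens" and "trunc" internally is not shown here.

```r
# CDF of a normal forecast censored below at `lower`: the probability of the
# underlying normal below `lower` becomes a point mass at `lower`.
pnorm_cens <- function(x, location, scale, lower) {
  ifelse(x < lower, 0, pnorm(x, location, scale))
}

# CDF of a normal forecast truncated below at `lower`: the probability below
# `lower` is discarded and the remainder is rescaled to total mass one.
pnorm_trunc <- function(x, location, scale, lower) {
  p_low <- pnorm(lower, location, scale)
  ifelse(x < lower, 0, (pnorm(x, location, scale) - p_low) / (1 - p_low))
}

# At the bound itself, the censored CDF jumps to the point mass,
# whereas the truncated CDF is still zero.
pnorm_cens(0, location = 0.5, scale = 1, lower = 0)
pnorm_trunc(0, location = 0.5, scale = 1, lower = 0)
```
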
Distributions for random variables defined on bounded intervals:
- "uniform" or "unif": min, max (lower and upper boundaries), lmass, umass (point mass in lower or upper boundary); see Uniform
- "beta": shape1, shape2 (positive parameters); see Beta
Distributions for random variables with discrete / infinite support:
- "poisson" or "pois": lambda (positive mean); see Poisson
- "negative-binomial" or "nbinom": size (positive dispersion parameter), prob (success probability), mu (mean, alternative to prob); see NegBinomial
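
For the families with alternative parameters ("gamma" with rate or scale, "negative-binomial" with prob or mu), the following base-R sketch shows how the alternatives relate. It assumes the conventions of the base R distribution functions referenced above (GammaDist, NegBinomial), namely scale = 1 / rate and prob = size / (size + mu); the numeric values are arbitrary.

```r
# Gamma: `scale` is the reciprocal of `rate`.
x <- c(0.5, 1, 2.5)
d_rate  <- dgamma(x, shape = 2, rate = 4)
d_scale <- dgamma(x, shape = 2, scale = 1 / 4)
all.equal(d_rate, d_scale)

# Negative binomial: the mean `mu` and the success probability `prob`
# are linked through prob = size / (size + mu).
k    <- 0:5
size <- 3
mu   <- 2
prob <- size / (size + mu)
d_prob <- dnbinom(k, size = size, prob = prob)
d_mu   <- dnbinom(k, size = size, mu = mu)
all.equal(d_prob, d_mu)
```

Supply only one of the two alternatives for a given family, since they describe the same distribution.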

All numerical arguments should be of the same length. The only exception is arguments of length 1, which are recycled to the common length.
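
The length rule can be sketched in base R as below. The helper recycle_args is hypothetical and only mirrors the rule as stated here: arguments of length 1 are repeated, and any other mismatch is rejected. The package's own input checks and error messages may differ.

```r
# Hypothetical helper: recycle length-1 arguments to the common length and
# reject any other combination of lengths.
recycle_args <- function(...) {
  args <- list(...)
  lens <- lengths(args)
  n <- max(lens)
  if (!all(lens %in% c(1L, n))) {
    stop("arguments must have length 1 or a common length")
  }
  lapply(args, rep_len, length.out = n)
}

# Scalars for mean and sd are repeated to match the three realizations.
ok <- recycle_args(y = c(0.2, 1.5, -0.3), mean = 0, sd = 2)
str(ok)

# A length-2 vector next to a length-3 vector is not recycled.
bad <- tryCatch(
  recycle_args(y = 1:3, mean = 1:2),
  error = function(e) conditionMessage(e)
)
bad
```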

References

General background and further references on scoring rules:

Gneiting, T. and A. Raftery (2007): `Strictly proper scoring rules, prediction and estimation', Journal of the American Statistical Association 102, 359-378.

Gneiting, T. and M. Katzfuss (2014): `Probabilistic forecasting', Annual Review of Statistics and Its Application 1, 125-151.

Closed form expressions of the CRPS for specific distributions:

Baran, S. and S. Lerch (2015): `Log-normal distribution based Ensemble Model Output Statistics models for probabilistic wind-speed forecasting', Quarterly Journal of the Royal Meteorological Society 141, 2289-2299. (Lognormal)

Friederichs, P. and T.L. Thorarinsdottir (2012): `Forecast verification for extreme value distributions with an application to probabilistic peak wind prediction', Environmetrics 23, 579-594. (Generalized Extreme Value, Generalized Pareto)

Gneiting, T., Raftery, A.E., Westveld III, A.H. and T. Goldman (2005): `Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation', Monthly Weather Review 133, 1098-1118. (Normal)

Gneiting, T., Larson, K., Westrick, K., Genton, M.G. and E. Aldrich (2006): `Calibrated probabilistic forecasting at the stateline wind energy center: The regime-switching space-time method', Journal of the American Statistical Association 101, 968-979. (Censored normal)

Gneiting, T. and T.L. Thorarinsdottir (2010): `Predicting inflation: Professional experts versus no-change forecasts', arXiv preprint arXiv:1010.2318. (Two-piece normal)

Grimit, E.P., Gneiting, T., Berrocal, V.J. and N.A. Johnson (2006): `The continuous ranked probability score for circular variables and its application to mesoscale forecast ensemble verification', Quarterly Journal of the Royal Meteorological Society 132, 2925-2942. (Mixture of normals)

Scheuerer, M. and D. Moeller (2015): `Probabilistic wind speed forecasting on a grid based on ensemble model output statistics', Annals of Applied Statistics 9, 1328-1349. (Gamma)

Thorarinsdottir, T.L. and T. Gneiting (2010): `Probabilistic forecasts of wind speed: ensemble model output statistics by using heteroscedastic censored regression', Journal of the Royal Statistical Society (Series A) 173, 371-388. (Truncated normal)

Wei, W. and L. Held (2014): `Calibration tests for count data', TEST 23, 787-805. (Poisson, Negative Binomial)

Independent listing of closed-form solutions for the CRPS:

Taillardat, M., Mestre, O., Zamo, M. and P. Naveau (2016): `Calibrated ensemble forecasts using quantile regression forests and ensemble model output statistics', Monthly Weather Review, in press.

Examples


# NOT RUN {
crps(y = 1, family = "normal", mean = 0, sd = 2)
logs(y = 1, family = "normal", mean = 0, sd = 2)

crps(y = rnorm(20), family = "normal", mean = 1:20, sd = sqrt(1:20))
logs(y = rnorm(20), family = "normal", mean = 1:20, sd = sqrt(1:20))

## Arguments can have different lengths:
crps(y = rnorm(20), family = "normal", mean = 0, sd = 2)
crps(y = 1, family = "normal", mean = 1:20, sd = sqrt(1:20))

## Lists (including data frames) are accepted as input if elements with the required names are present:
df <- data.frame(y = rnorm(20), mean = 1:20, sd = 1)
crps(df, family = "normal")
logs(df, family = "normal")

## Mixture of normal distributions requires matrix input for parameters:
mval <- matrix(rnorm(20*50), nrow = 20)
sdval <- matrix(runif(20*50, min = 0, max = 2), nrow = 20)
weights <- matrix(rep(1/50, 20*50), nrow = 20)
crps(y = rnorm(20), family = "mixnorm", m = mval, s = sdval, w = weights)
# }
