scoring: Predictive Model Assessment with Proper Scoring Rules

Description

Computes scores for the assessment of sharpness of a fitted model for time series of counts.

Usage

## S3 method for class 'tsglm'
scoring(object, individual=FALSE, cutoff=1000, ...)
## Default S3 method:
scoring(response, pred, distr=c("poisson", "nbinom"), distrcoefs, individual=FALSE, cutoff=1000, ...)

Arguments

object

an object of class "tsglm".

individual

logical. If FALSE (the default) the average scores are returned. Otherwise a matrix with the individual scores for each observation is returned.

cutoff

positive integer. Summation over the infinite sample space {0,1,2,...} of a distribution is cut off at this value. This affects the quadratic, spherical and ranked probability score.

response

integer vector. Vector of observed values $Y[1],...,Y[n]$.

pred

numeric vector. Vector of predicted values $\mu_P[1],...,\mu_P[n]$.

distr

character giving the conditional distribution. Currently implemented are the Poisson ("poisson") and the Negative Binomial ("nbinom") distributions.

distrcoefs

numeric vector of additional coefficients specifying the conditional distribution. For distr="poisson" no additional parameters need to be provided. For distr="nbinom" the additional parameter size needs to be specified (e.g. by distrcoefs=2), see tsglm for details.

...

further arguments are currently ignored. Only for compatibility with generic function.
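The effect of the cutoff argument can be illustrated with a small base R sketch. It does not call scoring itself; the value of lambda and the candidate cutoffs are illustrative assumptions. It truncates the sum of squared Poisson probabilities at several cutoffs and shows how little probability mass lies beyond each one.

```r
# Sketch: effect of truncating the sum over {0, 1, 2, ...} at a cutoff.
# Base R only; lambda and the cutoff values are illustrative choices.
lambda <- 12

# Truncated sum of squared probabilities for a Poisson(lambda) distribution
norm_sq <- function(cutoff) sum(dpois(0:cutoff, lambda)^2)

cutoffs <- c(10, 20, 50, 1000)
truncated <- sapply(cutoffs, norm_sq)
names(truncated) <- cutoffs
print(truncated)

# Probability mass lying beyond each cutoff
tail_mass <- ppois(cutoffs, lambda, lower.tail = FALSE)
print(tail_mass)
```

For predictive means of this size the truncated sum has effectively converged long before the default cutoff of 1000; a larger cutoff would only matter for predictive distributions with substantial mass beyond it.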

Value

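The average scores (or, if individual=TRUE, a matrix of the individual scores for each observation), named as follows:

logarithmic: Logarithmic score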
quadratic: Quadratic or Brier score
spherical: Spherical score
rankprob: Ranked probability score
dawseb: Dawid-Sebastiani score
normsq: Normalized squared error score
sqerror: Squared error score

Details

The scoring rules are penalties that should be minimised, so a smaller score indicates a better forecast. Competing forecast models can be ranked by these scores. They are computed as follows: for each score $s$ and time $t$ the value $s(P[t],Y[t])$ is computed, where $P[t]$ is the predictive c.d.f. and $Y[t]$ is the observation at time $t$. The overall score of a model is the average over all $n$ observations, $(1/n) \sum_{t=1}^{n} s(P[t],Y[t])$.

For all $t \geq 1$, let $p[y]=P(Y[t]=y | F[t-1])$ be the probability mass function of the predictive distribution at $y$ and let $||p||^2= \sum p[y]^2$ be the sum of the squared probabilities over the whole sample space $y=0,1,2,...$ of the predictive distribution. $\mu_P[t]$ and $\sigma_P[t]$ are the mean and the standard deviation of the predictive distribution, respectively.

Then the scores are defined as follows:

Logarithmic score: $logs(P[t],Y[t])= -\log p[y]$

Quadratic or Brier score: $qs(P[t],Y[t])= -2p[y] + ||p||^2$

Spherical score: $sphs(P[t],Y[t])= -p[y] / ||p||$

Ranked probability score: $rps(P[t],Y[t])=\sum (P[t](x) - 1(Y[t]\le x))^2$ (sum over the whole sample space $x=0,1,2,...$)

Dawid-Sebastiani score: $dss(P[t],Y[t]) = ( (Y[t]-\mu_P[t]) / (\sigma_P[t]) )^2 + 2 \log \sigma_P[t]$

Normalized squared error score: $nses(P[t],Y[t])= ( (Y[t]-\mu_P[t]) / (\sigma_P[t]) )^2$

Squared error score: $ses(P[t],Y[t])=(Y[t]-\mu_P[t])^2$

For more information on scoring rules see the references listed below.
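As a rough illustration of the definitions above, the following base R sketch evaluates the seven scores for Poisson predictive distributions and then averages them over time. The vectors y and mu are made-up data, score_one is a hypothetical helper, and the code is not taken from the package; it is only meant to mirror the formulas, with $p[y]$ evaluated at the observed value $y = Y[t]$.

```r
# Sketch: the seven scores for Poisson predictive distributions, base R only.
# y and mu are made-up illustrative data, not output of tsglm().
y  <- c(3, 0, 5, 2, 7)            # observations Y[t]
mu <- c(2.5, 1.2, 4.0, 3.1, 5.5)  # predictive means mu_P[t]
cutoff <- 1000                    # truncation point for the infinite sums

# Hypothetical helper: all scores for one observation and one predictive mean
score_one <- function(y, mu) {
  support <- 0:cutoff
  p   <- dpois(support, mu)       # predictive probabilities on the support
  cdf <- ppois(support, mu)       # predictive c.d.f. P[t](x)
  py  <- dpois(y, mu)             # probability of the observed value
  norm_sq <- sum(p^2)             # ||p||^2, truncated at the cutoff
  sigma <- sqrt(mu)               # Poisson: variance equals the mean
  c(logarithmic = -log(py),
    quadratic   = -2 * py + norm_sq,
    spherical   = -py / sqrt(norm_sq),
    rankprob    = sum((cdf - (y <= support))^2),
    dawseb      = ((y - mu) / sigma)^2 + 2 * log(sigma),
    normsq      = ((y - mu) / sigma)^2,
    sqerror     = (y - mu)^2)
}

individual <- t(mapply(score_one, y, mu))  # one row per observation
average    <- colMeans(individual)         # (1/n) * sum over t
print(round(average, 4))
```

The matrix individual corresponds in spirit to what individual=TRUE is described as returning, and average to the default output. For the negative binomial case the same structure would apply with dnbinom and pnbinom and a standard deviation that also depends on the size parameter.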

References

Christou, V. and Fokianos, K. (2013) On count time series prediction. Journal of Statistical Computation and Simulation (published online), http://dx.doi.org/10.1080/00949655.2013.823612.

Czado, C., Gneiting, T. and Held, L. (2009) Predictive model assessment for count data. Biometrics 65, 1254--1261, http://dx.doi.org/10.1111/j.1541-0420.2009.01191.x.

Gneiting, T., Balabdaoui, F. and Raftery, A.E. (2007) Probabilistic forecasts, calibration and sharpness. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 69, 243--268, http://dx.doi.org/10.1111/j.1467-9868.2007.00587.x.

Examples

library(tscount)

### Campylobacter infections in Canada (see help("campy"))
campyfit <- tsglm(ts=campy, model=list(past_obs=1, past_mean=c(7,13)))
scoring(campyfit)