Learn R Programming

surveillance (version 1.8-0)

hhh4_validation: Predictive Model Assessment for hhh4 models

Description

The function oneStepAhead computes successive one-step-ahead predictions for a (random effects) HHH model fitted by hhh4.

The function scores computes a number of (strictly) proper scoring rules based on the one-step-ahead predictions.

See Paul and Held (2011) for further details.

Usage

oneStepAhead(result, tp, type = c("rolling", "first", "final"),
             which.start = c("current", "final"),
             keep.estimates = FALSE, verbose = TRUE, cores = 1)

scores(object, unit = NULL, sign = FALSE, individual = FALSE)

Arguments

result
fitted hhh4 model (class "hhh4").
tp
numeric vector of length 1 or 2. One-step-ahead predictions are computed from time points tp[1], ..., tp[2] (yielding predictions for time points tp[1]+1, ...), where tp[2] defaults to t
type
The default "rolling" procedure sequentially refits the model up to each time point in tp and computes the one-step-ahead predictions for the respective next time point. The alternative types are no true
which.start
Which initial values should be used when successively refitting the model for subsets of the data (up to time point tp[1], up to tp[1]+1, ...) if type="rolling"? Default ("current") is to use
keep.estimates
logical indicating if parameter estimates and log-likelihoods from the successive fits should be returned.
verbose
non-negative integer (usually in the range 0:3) specifying the amount of tracing information to output. During hhh4 model updates, the following verbosity is used: 0 if cores > 1, otherwise <
cores
the number of cores to use when computing the predictions for the set of time points tp in parallel (with mclapply). Note that parallelization is not possible in the defaul
object
result of oneStepAhead.
unit
integer specifying a specific unit for which the scores are computed. If NULL all units are considered.
sign
logical indicating if the sign of the difference between the observed counts and corresponding predictions should be returned.
individual
logical indicating if individual scores for all units or a mean score over all units should be returned.

Value

  • oneStepAhead returns a list with
  • predmatrix of one-step-ahead predictions
  • observedmatrix with observed counts at the predicted time-points
  • psioverdispersion parameter on -log-scale in case of a negative-binomial model and NULL otherwise
  • allConvergedlogical indicating if all successive fits converged
  • If keep.estimates=TRUE, there are the following additional elements:
  • coefficientsmatrix with estimated regression parameters from the successive fits
  • Sigma.origmatrix with estimated variance parameters from the successive fits
  • logliksmatrix with columns "loglikelihood" and "margll" with their obvious meanings
  • scores returns a matrix with the computed logarithmic, ranked probability, squared error, Dawid-Sebastiani, and normalized squared error score as columns.

encoding

latin1

References

Czado, C., Gneiting, T. and Held, L. (2009) Predictive model assessment for count data. Biometrics, 65, 1254--1261

Paul, M. and Held, L. (2011) Predictive assessment of a non-linear random effects model for multivariate time series of infectious disease counts. Statistics in Medicine, 30, 1118--1136

See Also

vignette{"hhh4"} and hhh4.

Examples

Run this code
## univariate salmonella agona data
data(salmonella.agona)
# convert to sts class
salmonella <- disProg2sts(salmonella.agona)

# generate formula for temporal and seasonal trends
f.end <- addSeason2formula(f = ~ 1 + t, S=1, period=52)
model1 <- list(ar = list(f = ~ 1), end = list(f =f.end),
			  family = "NegBin1")
# run model
result <- hhh4(salmonella, model1)

# do one-step-ahead predictions for the last 5 weeks
pred <- oneStepAhead(result, nrow(salmonella)-5,
                     which.start="final", verbose=FALSE)

## test equivalence of parallelized version
if (.Platform$OS.type == "unix" && isTRUE(parallel::detectCores() > 1))
    stopifnot(identical(pred,
        oneStepAhead(result, nrow(salmonella)-5,
                     which.start="final", verbose=FALSE, cores=2)))

# compute mean scores
colMeans(scores(pred))

# type="final" just means fitted values
stopifnot(identical(
    unname(oneStepAhead(result, nrow(salmonella)-5,
                        type="final", verbose=FALSE)$pred),
    unname(tail(fitted(result), 5))))


######################################################################
# Do one-step-ahead predictions for the models from the Paul & Held 
# (2011) paper for the influenza data from Bavaria and Baden-Wuerttemberg 
# (this takes some time!)
######################################################################

## see ?hhh4 for a specification of the models

## do 1-step ahead predictions for the last two years

tp <- nrow(fluBYBW)-2*52

val_A0 <- oneStepAhead(res_A0,tp=tp) 
val_B0 <- oneStepAhead(res_B0,tp=tp) 
val_C0 <- oneStepAhead(res_C0,tp=tp) 

val_A1 <- oneStepAhead(res_A1,tp=tp) 
val_B1 <- oneStepAhead(res_B1,tp=tp) 
val_C1 <- oneStepAhead(res_C1,tp=tp) 

val_A2 <- oneStepAhead(res_A2,tp=tp) 
val_B2 <- oneStepAhead(res_B2,tp=tp) 
val_C2 <- oneStepAhead(res_C2,tp=tp) 

val_D <- oneStepAhead(res_D,tp=tp) 


##################################
## compute scores
################################

#scores
vals <- ls(pattern="val_")

nam <- substring(vals,first=5,last=6)

uni <- NULL
indiv <- TRUE

scores_i <- list()
meanScores <- NULL
for(i in seq(along.with=vals)){
  sc <- scores(get(vals[i]),unit=uni, individual=indiv)
  scores_i[[i]] <- sc
  meanScores <- rbind(meanScores,colMeans(sc))
}

names(scores_i) <- nam
rownames(meanScores) <- nam

##comparison with best model B2 

compareWithBest <- function(best, whichModels, whichScores=1:3, nPermut=9999, seed=1234){
  set.seed(seed)
  pVals <- NULL
  for(score in whichScores){
    p <- c()
    for(model in whichModels){
      if(model==best) p <- c(p,NA)
      else p <- c(p,permutationTest(scores_i[[model]][,score],scores_i[[best]][,score],
     plot=TRUE,nPermutation=nPermut)$pVal.permut)
    }  
    pVals <- cbind(pVals,p)
  }
  return(pVals)
}


pVals_flu <- compareWithBest(best=6, whichModels=1:10,seed=2059710987)
rownames(pVals_flu) <- nam

Run the code above in your browser using DataLab