surveillance (version 1.12.1)

categoricalCUSUM: CUSUM detector for time-varying categorical time series

Description

Function to process sts object by binomial, beta-binomial or multinomial CUSUM as described by H�hle{Hoehle} (2010). Logistic, multinomial logistic, proportional odds or Bradley-Terry regression models are used to specify in-control and out-of-control parameters. The implementation is illustrated in Salmon et al. (2016).

Usage

categoricalCUSUM(stsObj,control = list(range=NULL,h=5,pi0=NULL,
                 pi1=NULL, dfun=NULL, ret=c("cases","value")),...)

Arguments

stsObj
Object of class sts containing the number of counts in each of the $k$ categories of the response variable. Time varying number of counts $n_t$ is found in slot populationFrac.
control
Control object containing several items
  • range
{Vector of length $t_{max}$ with indices of the observed slot to monitor.} h{Threshold to use for the monitoring. Once the CUSUM

Value

  • An sts object with observed, alarm, etc. slots trimmed to the control$range indices.

encoding

latin1

item

...

code

dfun

Details

The function allows the monitoring of categorical time series as described by regression models for binomial, beta-binomial or multinomial data. The later includes e.g. multinomial logistic regression models, proportional odds models or Bradley-Terry models for paired comparisons. See the H�hle{Hoehle} (2010) reference for further details about the methodology.

Once an alarm is found the CUSUM scheme is resetted (to zero) and monitoring continues from there.

References

H�hle, M. (2010): Online Change-Point Detection in Categorical Time Series. Book chapter in T. Kneib and G. Tutz (Eds.), Statistical Modelling and Regression Structures, Physica-Verlag.

Salmon, M., Schumacher, D. and H�hle{Hoehle}, M. (2016): Monitoring count time series in R: Aberration detection in public health surveillance. Journal of Statistical Software, 70 (10), 1-35. 10.18637/jss.v070.i10

See Also

categoricalCUSUM

Examples

Run this code
if (require("gamlss")) {
  ###########################################################################
  #Beta-binomial CUSUM for a small example containing the time-varying
  #number of positive test out of a time-varying number of total
  #test.
  #######################################

  #Load meat inspection data
  data("abattoir")

  #Use GAMLSS to fit beta-bin regression model
  phase1 <- 1:(2*52)
  phase2  <- (max(phase1)+1) : nrow(abattoir)

  #Fit beta-binomial model using GAMLSS
  abattoir.df <- as.data.frame(abattoir)
  colnames(abattoir.df) <- c("y","t","state","alarm","n")
  m.bbin <- gamlss( cbind(y,n-y) ~ 1 + t +
		    + sin(2*pi/52*t) + cos(2*pi/52*t) +
		    + sin(4*pi/52*t) + cos(4*pi/52*t), sigma.formula=~1,
		    family=BB(sigma.link="log"),
		    data=abattoir.df[phase1,c("n","y","t")])

  #CUSUM parameters
  R <- 2 #detect a doubling of the odds for a test being positive
  h <- 4 #threshold of the cusum

  #Compute in-control and out of control mean
  pi0 <- predict(m.bbin,newdata=abattoir.df[phase2,c("n","y","t")],type="response")
  pi1 <- plogis(qlogis(pi0)+log(R))
  #Create matrix with in control and out of control proportions.
  #Categories are D=1 and D=0, where the latter is the reference category
  pi0m <- rbind(pi0, 1-pi0)
  pi1m <- rbind(pi1, 1-pi1)


  ######################################################################
  # Use the multinomial surveillance function. To this end it is necessary
  # to create a new abattoir object containing counts and proportion for
  # each of the k=2 categories. For binomial data this appears a bit
  # redundant, but generalizes easier to k>2 categories.
  ######################################################################

  abattoir2 <- new("sts",epoch=1:nrow(abattoir), start=c(2006,1),freq=52,
    observed=cbind(abattoir@observed,abattoir@populationFrac -abattoir@observed),
    populationFrac=cbind(abattoir@populationFrac,abattoir@populationFrac),
    state=matrix(0,nrow=nrow(abattoir),ncol=2),
    multinomialTS=TRUE)

  ######################################################################
  #Function to use as dfun in the categoricalCUSUM
  #(just a wrapper to the dBB function). Note that from v 3.0-1 the
  #first argument of dBB changed its name from "y" to "x"!
  ######################################################################
  mydBB.cusum <- function(y, mu, sigma, size, log = FALSE) {
    return(dBB(y[1,], mu = mu[1,], sigma = sigma, bd = size, log = log))
  }


  #Create control object for multinom cusum and use the categoricalCUSUM
  #method
  control <- list(range=phase2,h=h,pi0=pi0m, pi1=pi1m, ret="cases",
		   dfun=mydBB.cusum)
  surv <- categoricalCUSUM(abattoir2, control=control,
			   sigma=exp(m.bbin$sigma.coef))

  #Show results
  plot(surv[,1],dx.upperbound=0)
  lines(pi0,col="green")
  lines(pi1,col="red")

  #Index of the alarm
  which.max(alarms(surv[,1]))
}

#ToDo: Compute run length using LRCUSUM.runlength

Run the code above in your browser using DataCamp Workspace