interv_multiple.tsglm: Detecting Multiple Interventions in Count Time Series Following Generalised Linear Models

Description

Iterative detection procedure for multiple interventions of unknown types occurring at unknown times, as proposed by Fokianos and Fried (2010, 2012).

Usage

interv_multiple(fit, taus=2:length(fit$ts), deltas=c(0,0.8,1), external=FALSE, B=10, signif_level=0.05, start.control_bootstrap, final.control_bootstrap, inter.control_bootstrap, parallel=FALSE, ...)

Arguments

fit

an object of class "tsglm". Usually the result of a call to tsglm.

taus

integer vector of times at which an intervention is considered possible. The default, 2:length(fit$ts), considers all times from the second observation onwards.

deltas

numeric vector that determines the types of intervention to be considered (see Details).

external

logical value specifying whether the intervention's effect is external or not (see Details).

B

positive integer value giving the number of bootstrap samples for estimation of the p-value.

signif_level

numeric value with 0 <= signif_level <= 1 giving a significance level for the procedure.

start.control_bootstrap

named list that determines how to make initial estimation in the bootstrap, see argument start.control in tsglm. If missing, the same settings as for the regular estimation are used.

final.control_bootstrap

named list that determines how to make final maximum likelihood estimation in the bootstrap, see argument final.control in tsglm. If missing, the same settings as for the regular estimation are used. If final.control_bootstrap=NULL, then the model is not re-fitted for each bootstrap sample. Instead the parameters of the original fit which have been used for simulating the bootstrap samples are used. This approach saves computation time at the cost of a more conservative procedure, see Fokianos and Fried (2012).

inter.control_bootstrap

named list determining how to maximise the log-likelihood function in an intermediate step, see argument inter.control in tsglm. If missing, the same settings as for the regular estimation are used.

parallel

logical value. If parallel=TRUE, the bootstrap is run in parallel on multiple cores. This requires a computing cluster to be initialised and registered as the default cluster by makeCluster and setDefaultCluster from package parallel.

...

additional arguments passed to interv_detect, the function for detection of single intervention effects. Some of these arguments are in turn passed on to the fitting function tsglm.

Value

interventions: data frame giving the detected interventions, which has the variables tau, delta, size, test_statistic and p-value.
fit_H0: object of class "tsglm" with the fitted model under the null hypothesis of no intervention, see tsglm.
fit_cleaned: object of class "tsglm" with the fitted model for the cleaned time series after the last step of the iterative procedure, see tsglm.
model_interv: model specification of the model with all detected interventions at their respective times.
fit_interv: object of class "tsglm" with the fitted model with all detected interventions at their respective times, see tsglm.
track: named list of matrices with the detailed results of the iterative detection procedure. Element tau_max gives the times where the test statistic has its maximum for each type of intervention and in each iteration step and element size gives the estimated sizes of the respective intervention effects. Elements test_statistic and p_value require no further explanation.

Details

This function performs an iterative procedure for detection of multiple intervention effects. In each step the function interv_detect is applied for each of the possible intervention types provided in the argument deltas. If there is (after a Bonferroni correction) no significant intervention effect the procedure stops. Otherwise the type of intervention with the minimum p-value is chosen. In case of equal p-values preference is given to a level shift (i.e. $\delta=1$) and then to the type of intervention with the largest test statistic. The effect of the chosen intervention is removed from the time series. The time series cleaned from the intervention effect is tested for further interventions in a next step.
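The selection rule applied in each step can be sketched in base R as follows. This is a minimal illustration and not the package's internal code: the helper choose_intervention, the data frame candidates and all numbers in it are hypothetical, and the exact comparison with the significance level is an assumption. Only the rule itself (Bonferroni correction, minimum p-value, ties broken first in favour of delta = 1 and then by the larger test statistic) is taken from the description above.

```r
# Hypothetical helper illustrating the selection rule of one iteration step.
# 'candidates' has one row per intervention type with the columns
# delta, test_statistic and p_value (raw, uncorrected p-values).
choose_intervention <- function(candidates, signif_level = 0.05) {
  # Bonferroni correction: multiply by the number of intervention types
  candidates$p_value <- candidates$p_value * nrow(candidates)
  # Stop if no corrected p-value is significant (assumed comparison)
  if (all(candidates$p_value > signif_level)) return(NULL)
  # Order by p-value, then prefer delta == 1, then the larger test statistic
  ord <- order(candidates$p_value,
               candidates$delta != 1,
               -candidates$test_statistic)
  candidates[ord[1], ]
}

# Made-up values for the three default intervention types
candidates <- data.frame(delta = c(0, 0.8, 1),
                         test_statistic = c(25.1, 18.4, 21.7),
                         p_value = c(0.002, 0.04, 0.002))
chosen <- choose_intervention(candidates)
chosen$delta  # the level shift wins the tie in p-values
```

In the made-up data the types delta = 0 and delta = 1 share the smallest p-value, so the level shift is chosen even though the spiky outlier has the larger test statistic.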

For each time in taus the test statistic of a score test on an intervention effect occurring at that time is computed, see interv_test. The time with the maximum test statistic is considered as a candidate for a possible intervention effect at that time. The type of the intervention effect is specified by delta as described in interv_covariate. The intervention is included as an additional covariate according to the definition in tsglm. It can have an internal (the default) or external (external=TRUE) effect (see Liboschik et al., 2013).
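To make the role of delta concrete, the sketch below builds an intervention covariate by hand in base R. It assumes that the covariate has the form delta^(t - tau) for t >= tau and zero before tau; see interv_covariate for the authoritative definition. The helper name interv_shape is hypothetical and not part of the package.

```r
# Hypothetical helper: intervention covariate of length n for an
# intervention at time tau with parameter delta. Assumed form:
# delta^(t - tau) for t >= tau and 0 otherwise.
interv_shape <- function(n, tau, delta) {
  t <- seq_len(n)
  # pmax() avoids negative exponents for times before tau
  ifelse(t >= tau, delta^pmax(t - tau, 0), 0)
}

interv_shape(n = 8, tau = 4, delta = 0)    # single spike at time 4
interv_shape(n = 8, tau = 4, delta = 0.8)  # geometrically decaying effect
interv_shape(n = 8, tau = 4, delta = 1)    # permanent level shift
```

Under this assumption the default deltas=c(0,0.8,1) cover a spiky outlier, a transient shift and a level shift, respectively.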

All p-values given in the output are multiplied by the number of intervention types considered to account for the multiple testing in each step by a Bonferroni correction. Note that this correction can lead to p-values greater than one.
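As a small numerical illustration of this correction, the following base R lines use made-up raw p-values for the three default intervention types. The comparison with p.adjust is added only for contrast: that base R function caps Bonferroni-adjusted p-values at one, whereas the values described here are not capped.

```r
deltas <- c(0, 0.8, 1)
raw_p <- c(0.01, 0.20, 0.45)            # made-up raw p-values
corrected_p <- raw_p * length(deltas)   # uncapped Bonferroni correction
corrected_p                             # the last value exceeds one

# For comparison: p.adjust() truncates the adjusted p-values at one
capped_p <- p.adjust(raw_p, method = "bonferroni")
capped_p
```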

Note that this bootstrap procedure is very time-consuming.

References

Fokianos, K. and Fried, R. (2010) Interventions in INGARCH processes. Journal of Time Series Analysis 31(3), 210--225, http://dx.doi.org/10.1111/j.1467-9892.2010.00657.x.

Fokianos, K. and Fried, R. (2012) Interventions in log-linear Poisson autoregression. Statistical Modelling 12(4), 299--322, http://dx.doi.org/10.1177/1471082X1201200401.

Liboschik, T., Kerschke, P., Fokianos, K. and Fried, R. (2013) Modelling interventions in INGARCH processes. SFB 823 Discussion Paper 03/13, http://hdl.handle.net/2003/29878.

Examples


## Not run: 
# ###Campylobacter infections in Canada (see help("campy"))
# #Searching for potential intervention effects (runs several hours!):
# campyfit <- tsglm(ts=campy, model=list(past_obs=1, past_mean=c(7,13)))
# campyfit_intervmultiple <- interv_multiple(fit=campyfit, taus=80:120,
#                               deltas=c(0,0.8,1), B=500, signif_level=0.05)
# campyfit_intervmultiple
# plot(campyfit_intervmultiple)
# #Parallel computation for shorter run time on a cluster:
# library(parallel)
# ntasks <- 3
# clust <- makeCluster(ntasks)
# setDefaultCluster(cl=clust)
# interv_multiple(fit=campyfit, taus=80:120, deltas=c(0,0.8,1), B=500,
#                 signif_level=0.05, parallel=TRUE)
## End(Not run)
