tsglm
fits a generalised linear model (GLM) for time series of counts.
The specification of the linear predictor allows for regressing on past observations, past values of the linear predictor and covariates as defined in the Details section.
There is the so-called INGARCH model with the identity link (see for example Ferland et al., 2006, Fokianos et al., 2009) and another model with the logarithmic link (see for example Fokianos and Tjostheim, 2011), which also differ in the specification of the linear predictor.
The conditional distribution can be chosen to be either Poisson or negative binomial.Estimation is done by conditional maximum likelihood for the Poisson distribution or by a conditional quasi-likelihood approach based on the Poisson likelihood function for the negative binomial distribution.
There is a vignette available which introduces the functionality of tsglm
and related functions of this package and its underlying statistical methods (vignette("tsglm", package="tscount")
).
The function mean.fit
is a lower level function to fit the mean specification of such a model assuming a Poisson distribution. It is called by tsglm
. It has additional arguments allowing for a finer control of the fitting procedure, which can be handed over from the function tsglm
by its ...
argument. Note that it is usually not necessary for a user to call this lower level functions nor to worry about the additional arguments provided by this function. The defaults of these arguments have been chosen wisely by the authors of this package and should perform well in most applications.
tsglm(ts, model = list(past_obs = NULL, past_mean = NULL, external = NULL), xreg = NULL, link = c("identity", "log"), distr = c("poisson", "nbinom"), ...)
mean.fit(ts, model, xreg, link, score = TRUE, info = c("score", "none", "hessian", "sandwich"), init.method=c("marginal", "iid", "firstobs", "zero"), init.drop = FALSE, epsilon = 1e-06, slackvar = 1e-06, start.control = list(), final.control = list(), inter.control = NULL)
past_obs
past_mean
external
ncol(xreg)
specifying for each covariate wether its effect should be external or not (see Details). If this is a scalar this choice will be used for all covariates. If omitted, all covariates will have an internal effect (i.e. external=FALSE
).
length(ts)
. This is the matrix $X$ (see Details). If omitted no covariates will be included. For the identity link the covariates have to be non-negative.
"identity"
, fitting an INGARCH model. Another possible choice is "log"
, fitting a log-linear model.
"poisson"
, i.e. a Poisson distribution.
mean.fit
. See below.
"score"
(the default) for calculation via the outer product of the score vector, or to "hessian"
for calculation via the Hessian matrix of second derivatives. For info="sandwich"
the information matrix is estimated by a sandwich formula using both the outer score product and the Hessian matrix. If set to "none"
, no information matrix is computed. For distr="nbinom"
one can only use info="score"
.
"marginal"
(the default), the marginal mean of a model without covariates and its derivatives are used. If set to "iid"
, all values are initialised by the marginal mean under the assumption of i.i.d. data, which depends on the intercept only. If set to "firstobs"
the first obersvation is used. If set to "zero"
, the recursions are initialised by the value zero.
TRUE
, the first max(model$past_obs)
observations, which are needed for the autoregression, are not considered. If FALSE
(the default), all observations are considered and pre-sample values determined by the method specified by the argument itit.method
are used for the autoregression. Note that in the first case the effective number of observations used for maximum likelihood estimation is lower than the total number of observations of the original time series. Consequently only this lower number of observations is considered in the output. Note that for init.drop=TRUE
the log-likelihood function for models of different orders might not be comparable if the effective number of observations is different.
x < y
will be transformed to x + slackvar <= y<="" code="">.
=>
use
use = Inf
all observations are used, which is the default.
method
"iid"
, "CSS"
, "CSS-ML"
, "ML"
, "MM"
, "GLM"
and "fixed"
.
If method
is "iid"
(the default), a moment estimator assuming an iid model without covariates is used.
If method="MM"
, the start estimate is an ARMA(1,1) fit by moment estimators and parameters of higher order than one are set to zero. For this method the starting parameter values for the covariates are zero by default and can be set by the list element xreg
.
If method
is "CSS"
, "CSS-ML"
or "ML"
, the start estimate is based on an ARMA fit using the function arima
, and list element method
is passed to its argument of the same name.
If method="GLM"
, the estimated parameters of a generalised linear model with regression on the specified past observations and covariates, but not on past conditional means, are used as start estimates. Initial estimates for the coefficients of past conditional means are set to zero.
If method="fixed"
, parameters given in further named list elements of start.control
are used when available, else the predefined values given in the following are used.
intercept
past_obs
past_mean
xreg
method="MM"
. Default values are zero.
final.control=NULL
, only start estimates are computed and a list with fewer elements which has not the class "tsglm"
is returned. Possible list elements of this argument are:
constrained
constrOptim
with possible elements mu
, outer.iterations
and outer.eps
(see constrOptim
for details). If constrained=NULL
, an unconstrained optimisation is made with function optim
. Note that this is likely to result in a fitted model which is non-stationary, which might cause further problems.
optim.method
constrOptim
or optim
as argument method
. The default is "BFGS"
.
optim.control
constrOptim
or optim
as the argument control
. Must not contain the list element fnscale
. The default is list(maxit=20, reltol=1e-8)
.
final.control
. The default is inter.control=NULL
, which skips this intermediate optimisation step.
"tsglm"
, which is a list with at least the following elements:coef
method.residuals
method.fitted
method.init.drop=TRUE
).logLik
method. This is the complete log-likelihood including all constant terms. It is based on n_eff
observations (see below).distr
.n_obs
if argument init.drop=TRUE
).NULL
in case of a Poisson distribution.ingarch.fit
and loglin.fit
have the same output except the elements distr
, distrcoefs
and sigmasq
. In addition, they return the following list elements:constrOptim
or optim
.constrOptim
or optim
.link="identity"
) used here follows the definition
$$Z_{t}|{\cal{F}}_{t-1} \sim \mathrm{Poi}(\nu_{t}) \quad \mathrm{or} \quad Z_{t}|{\cal{F}}_{t-1} \sim \mathrm{NegBin}(\nu_{t}, \phi),$$
where $F[t-1]$ denotes the history of the process up to time $t-1$, $Poi$ and $NegBin$ is the Poisson respectively the negative binomial distribution with the parametrisation as specified below.
For the model with covariates having an internal effect (the default) the linear predictor of the INGARCH model (which is in that case identical to the conditional mean) is given by
$$\nu_t = \beta_0 + \beta_1 Z_{t-i_1} + \ldots + \beta_p Z_{t-i_p}
+ \alpha_1 \nu_{t-j_1} + \ldots + \alpha_q \nu_{t-j_q}
+ \eta_1 X_{t,1} + \ldots + \eta_r X_{t,r}.$$The log-linear model (argument link="log"
) used here follows the definition
$$Z_{t}|{\cal{F}}_{t-1} \sim \mathrm{Poi}(\lambda_{t}) \quad \mathrm{or} \quad Z_{t}|{\cal{F}}_{t-1} \sim \mathrm{NegBin}(\lambda_{t}, \phi),$$
with $\lambda[t] = \exp(\nu[t])$ and $F[t-1]$ as above.
For the model with covariates having an internal effect (the default) the linear predictor $\nu[t] = \log(\lambda[t])$ of the log-linear model is given by
$$\nu_t = \beta_0 + \beta_1 \log(Z_{t-i_1}+1) + \ldots + \beta_p \log(Z_{t-i_p}+1)
+ \alpha_1 \nu_{t-j_1} + \ldots + \alpha_q \nu_{t-j_q}
+ \eta_1 X_{t,1} + \ldots + \eta_r X_{t,r}.$$
Note that because of the logarithmic link function the effect of single summands in the linear predictor on the conditional mean is multiplicative and hence the parameters play a different role than in the INGARCH model, although they are denoted by the same letters.
The Poisson distribution is parametrised by the mean lambda
according to the definition in Poisson
.
The negative binomial distribution is parametrised by the mean mu
with an additional dispersion parameter size
according to the definition in NegBinomial
. In the notation above its mean parameter mu
is $\nu[t]$ and its dispersion parameter size
is $\phi$.
This function allows to include covariates in two different ways. A covariate can have a so-called internal effect as defined above, where its effect propagates via the regression on past values of the linear predictor and on past observations. Alternatively, it can have a so-called external effect, where its effect does not directly propagates via the feedback on past values of the linear predictor, but only via past observations. For external effects of the covariates, the linear predictor for the model with identity link is given by $$\nu_t = \mu_t + \eta_1 X_{t,1} + \ldots + \eta_r X_{t,r},$$ $$\mu_t = \beta_0 + \beta_1 Z_{t-i_1} + \ldots + \beta_p Z_{t-i_p} + \alpha_1 \mu{t-j_1} + \ldots + \alpha_q \mu{t-j_q},$$ and analoguesly for the model with logarithmic link by $$\nu_t = \mu_t + \eta_1 X_{t,1} + \ldots + \eta_r X_{t,r},$$ $$\mu_t = \beta_0 + \beta_1 \log(Z_{t-i_1}+1) + \ldots + \beta_p \log(Z_{t-i_p}+1) + \alpha_1 \mu{t-j_1} + \ldots + \alpha_q \mu{t-j_q}.$$ This is described in more detail by Liboschik et al. (2014) for the case of deterministic covariates for modelling interventions. It is also possible to model a combination of external and internal covariates, which can be defined straightforwardly by adding each covariate either to the linear predictor $\nu[t]$ itself (for an internal effect) or to $\mu[t]$ defined above (for an external effect).
Christou, V. and Fokianos, K. (2015) Estimation and testing linearity for non-linear mixed poisson autoregressions. Electronic Journal of Statistics 9, 1357--1377, http://dx.doi.org/10.1214/15-EJS1044.
Ferland, R., Latour, A. and Oraichi, D. (2006) Integer-valued GARCH process. Journal of Time Series Analysis 27(6), 923--942, http://dx.doi.org/10.1111/j.1467-9892.2006.00496.x.
Fokianos, K. and Fried, R. (2010) Interventions in INGARCH processes. Journal of Time Series Analysis 31(3), 210--225, http://dx.doi.org/10.1111/j.1467-9892.2010.00657.x.
Fokianos, K., and Fried, R. (2012) Interventions in log-linear Poisson autoregression. Statistical Modelling 12(4), 299--322. http://dx.doi.org/10.1177/1471082X1201200401.
Fokianos, K., Rahbek, A. and Tjostheim, D. (2009) Poisson autoregression. Journal of the American Statistical Association 104(488), 1430--1439, http://dx.doi.org/10.1198/jasa.2009.tm08270.
Fokianos, K. and Tjostheim, D. (2011) Log-linear Poisson autoregression. Journal of Multivariate Analysis 102(3), 563--578, http://dx.doi.org/10.1016/j.jmva.2010.11.002.
Liboschik, T., Kerschke, P., Fokianos, K. and Fried, R. (2014) Modelling interventions in INGARCH processes. International Journal of Computer Mathematics (published online), http://dx.doi.org/10.1080/00207160.2014.949250.
print
, summary
, residuals
, plot
, fitted
, coef
, predict
, logLik
, vcov
, AIC
, BIC
and QIC
for the class "tsglm"
.
The S3 method se
computes the standard errors of the parameter estimates.
Additionally, there are the S3 methods pit
, marcal
and scoring
for predictive model assessment.
S3 methods interv_test
, interv_detect
and interv_multiple
for tests and detection procedures for intervention effects.
tsglm.sim
for simulation from GLM-type model for time series of counts. ingarch.mean
, ingarch.var
and ingarch.acf
for calculation of analytical mean, variance and autocorrelation function of an INGARCH model (i.e. with identity link) without covariates.
Example time series of counts are campy
, ecoli
, ehec
, influenza
, measles
in this package, polio
in package gamlss.data
.
###Campylobacter infections in Canada (see help("campy"))
interventions <- interv_covariate(n=length(campy), tau=c(84, 100),
delta=c(1, 0)) #detected by Fokianos and Fried (2010, 2012)
#Linear link function with Negative Binomial distribution:
campyfit <- tsglm(campy, model=list(past_obs=1, past_mean=13),
xreg=interventions, distr="nbinom")
campyfit
plot(campyfit)
###Road casualties in Great Britain (see help("Seatbelts"))
timeseries <- Seatbelts[, "VanKilled"]
regressors <- cbind(PetrolPrice=Seatbelts[, c("PetrolPrice")],
linearTrend=seq(along=timeseries)/12)
#Logarithmic link function with Poisson distribution:
seatbeltsfit <- tsglm(ts=timeseries, link="log",
model=list(past_obs=c(1, 12)), xreg=regressors, distr="poisson")
summary(seatbeltsfit)
Run the code above in your browser using DataLab