QIC: Quasi Information Criterion (QIC) for glmstarma and dglmstarma objects

Description

Generic function to compute the QIC (Pan, 2001), a model selection criterion commonly used for Generalized Estimating Equations (GEE) and related models.

Usage

QIC(object, ...)
# S3 method for glmstarma
QIC(object, adjust = TRUE, ...)
# S3 method for dglmstarma
QIC(object, adjust = TRUE, ...)

Value

A numeric value for the QIC.

Arguments

object: Object of class glmstarma or dglmstarma.
...: Additional arguments passed to specific methods.
adjust: Logical; if TRUE (default), the (quasi-)log-likelihood is adjusted for the temporal orders of the model before the QIC is computed. See Details.

Details

The quasi information criterion (QIC) was proposed by Pan (2001) as an alternative to Akaike's information criterion (AIC) that is suited to regression analysis based on generalized estimating equations (GEE). It is defined as $$QIC = -2 \cdot \ell + 2 \cdot \left(\mathrm{trace}(G_{\mu}^{-1} H_{\mu}) + \mathrm{trace}(G_{\phi}^{-1} H_{\phi})\right),$$ where $\ell$ is the (quasi-)log-likelihood of the estimated model, $G_{\mu}$ is the expected information matrix of the regression parameters of the mean model, and $H_{\mu}$ is the empirical covariance matrix of the regression parameters of the mean model. Similarly, $G_{\phi}$ and $H_{\phi}$ denote the corresponding matrices for the dispersion model (only for dglmstarma objects). For glmstarma objects, the term in parentheses reduces to $\mathrm{trace}(G_{\mu}^{-1} H_{\mu})$.

For more details on the calculation of $G$ and $H$, see sandwich_variance.
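
The definition above can be traced by hand with a small base R sketch. The helper qic_from_parts is hypothetical and not part of the package, and the matrices and log-likelihood are invented toy values; the sketch only mirrors the formula and says nothing about how the package computes $G$ and $H$ internally.

```r
# Hypothetical helper (not part of the package): QIC assembled from its parts.
qic_from_parts <- function(loglik, G_mu, H_mu, G_phi = NULL, H_phi = NULL) {
  # trace(G^{-1} H), using solve(G, H) instead of forming the inverse explicitly
  penalty <- sum(diag(solve(G_mu, H_mu)))
  if (!is.null(G_phi) && !is.null(H_phi)) {
    # dispersion-model term, only relevant for a model with a dispersion part
    penalty <- penalty + sum(diag(solve(G_phi, H_phi)))
  }
  -2 * loglik + 2 * penalty
}

# Toy inputs, invented for illustration only
G_mu <- diag(c(2, 4))
H_mu <- diag(c(2, 4))   # H equals G here, so trace(G^{-1} H) = 2
G_phi <- matrix(5)
H_phi <- matrix(10)     # trace(G^{-1} H) = 10 / 5 = 2

# Mean model only: -2 * (-10) + 2 * 2
qic_mean_only <- qic_from_parts(loglik = -10, G_mu = G_mu, H_mu = H_mu)

# Mean and dispersion model: -2 * (-10) + 2 * (2 + 2)
qic_with_dispersion <- qic_from_parts(-10, G_mu, H_mu, G_phi, H_phi)
```

In the toy case $H_{\mu} = G_{\mu}$, so the trace equals the number of mean parameters and the penalty has the same form as the one in AIC.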

During model estimation, the (quasi-)log-likelihood is computed only on the last n_eff time-points, where n_eff = n - max_time_lag_mean - max_time_lag_dispersion. Here n is the total number of time-points, max_time_lag_mean the maximum temporal lag in the mean model, and max_time_lag_dispersion the maximum temporal lag in the dispersion model (for dglmstarma objects). If no dispersion model is present (class glmstarma), max_time_lag_dispersion is zero.

To be more specific, the (quasi-)log-likelihood calculated during model estimation is given by $$\ell(\mathbf{\theta}) = \sum_{t = \tau}^n \sum_{i = 1}^p \ell_{i, t}(\mathbf{\theta}),$$ where $\ell_{i, t}(\mathbf{\theta})$ denotes the (quasi-)log-likelihood of the observation at location $i$ at time $t$, $p$ is the number of locations, and $\tau = n - n_{\mathrm{eff}} + 1$, so that the sum runs over the last $n_{\mathrm{eff}}$ time-points $t = \tau, \ldots, n$.

This calculation of the (quasi-)log-likelihood introduces bias when comparing models of different temporal orders, because models with larger temporal lags sum over fewer time-points. If adjust = TRUE, the (quasi-)log-likelihood is rescaled to n time-points by multiplying it by $n / n_{\mathrm{eff}}$ before the QIC is calculated.
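
The rescaling can be followed with a short base R calculation. All numbers are invented for illustration and do not come from any fitted model; the variable names simply follow the quantities defined above.

```r
# All values below are invented for illustration.
n <- 100                        # total number of time-points
max_time_lag_mean <- 7          # maximum temporal lag in the mean model
max_time_lag_dispersion <- 1    # zero when there is no dispersion model

# Number of time-points that enter the (quasi-)log-likelihood
n_eff <- n - max_time_lag_mean - max_time_lag_dispersion

# Hypothetical (quasi-)log-likelihood summed over the last n_eff time-points
loglik_eff <- -460

# Rescaling applied when adjust = TRUE: multiply by n / n_eff
loglik_adjusted <- loglik_eff * n / n_eff
```

With these values n_eff is 92 and the rescaled (quasi-)log-likelihood is -500. A competing model with smaller lags would have a larger n_eff and hence a rescaling factor closer to 1.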

References

Pan, W. (2001). Akaike's Information Criterion in Generalized Estimating Equations. Biometrics, 57(1), 120–125. doi:10.1111/j.0006-341X.2001.00120.x

Examples


# \donttest{
dat <- load_data("chickenpox", directory = tempdir())
chickenpox <- dat$chickenpox
population_hungary <- dat$population_hungary
W_hungary <- dat$W_hungary

model_autoregressive <- list(past_obs = rep(1, 7))
fit <- glmstarma(chickenpox, model_autoregressive, W_hungary, family = vpoisson("log"),
                 covariates = list(population = population_hungary))
QIC(fit)

mean_model <- list(past_obs = rep(1, 7))
dispersion_model <- list(past_obs = 1)
fit2 <- dglmstarma(chickenpox, mean_model, dispersion_model, mean_family = vquasipoisson("log"),
                   dispersion_link = "log",
                   wlist = W_hungary, 
                   mean_covariates = list(population = population_hungary))
QIC(fit2)
# }
