ST.cluster: Fit the space-time Poisson log-linear clustering model proposed by Lee and Lawson (2014).

Description

The function fits the Bayesian spatio-temporal clustering model proposed by Lee and Lawson (2014) to Poisson count data. The linear predictor, which models the natural log of the mean, is made up of a covariate component and a clustering component. The clustering component represents the spatial mean surface as a piecewise constant term with G levels, or classes. The mean in each class changes over time, and each area is allowed to change class over time. Both the class means and the allocation of areas to classes are temporally correlated. Inference is based on Markov chain Monte Carlo (MCMC) simulation, using a combination of Gibbs sampling and Metropolis steps.

Usage

ST.cluster(formula, K, data=NULL, G, burnin=0, n.sample=1000, thin=1,  
prior.mean.beta=NULL, prior.var.beta=NULL, prior.sigma2=NULL, prior.alpha=NULL, 
prior.delta=NULL, verbose=TRUE)

Arguments

formula

A formula for the covariate part of the model, using the same notation as for the lm() function. The offsets should also be included here using the offset() function. The response and each covariate should be vectors of length (KN)*1, where K is the number of spatial units and N is the number of time periods.

K

The number of spatial units in the spatio-temporal data vectors.
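The layout of the data vectors can be illustrated with a small sketch. It assumes the inputs follow the same ordering that the Value section describes for the fitted values: all K spatial units for time period one come first, then time period two, and so on. The names `index`, `area`, `time.period` and `position` are illustrative only and are not part of the function's interface.

```r
# Illustrative dimensions (hypothetical values)
K <- 4   # number of spatial units
N <- 3   # number of time periods

# expand.grid() varies its first argument fastest, so the rows run through
# all K areas for time period 1, then all K areas for time period 2, etc.
index <- expand.grid(area = 1:K, time.period = 1:N)

# Position of area k at time period j in a vector of length K * N
position <- function(k, j, K) (j - 1) * K + k

head(index, K + 1)
position(k = 2, j = 3, K = K)
```

A data.frame built this way can be sorted by time period and then by area to check that the response and covariates are in the assumed order before fitting.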

data

A data.frame containing the variables in the formula.

G

The maximum number of risk classes. Note that some classes may be estimated as empty.

burnin

The number of MCMC samples to discard as the burn-in period. Defaults to 0.

n.sample

The number of MCMC samples to generate. Defaults to 1,000.

thin

The level of thinning to apply to the MCMC samples to reduce their temporal autocorrelation. Defaults to 1.
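As a rough guide to how burnin, n.sample and thin interact, the sketch below computes the number of samples that would be retained. It assumes the common convention that the burn-in samples are discarded first and every thin-th sample is kept afterwards; this page does not state the rule explicitly, so treat the formula as an assumption. The name `n.keep` is illustrative.

```r
# Hypothetical MCMC settings
n.sample <- 10000   # total samples generated
burnin   <- 5000    # samples discarded at the start
thin     <- 10      # keep every 10th sample after burn-in

# Assumed retention rule: (n.sample - burnin) / thin samples are stored
n.keep <- floor((n.sample - burnin) / thin)
n.keep
```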

prior.mean.beta

A vector of prior means for the regression parameters beta (Gaussian priors are assumed). Defaults to a vector of zeros.

prior.var.beta

A vector of prior variances for the regression parameters beta (Gaussian priors are assumed). Defaults to a vector with values 1000.
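When supplying the priors for beta by hand, the vectors need one element per regression parameter. The sketch below counts the parameters with base R's model.matrix(), assuming the function builds its design matrix in the usual way (an intercept plus one column per covariate, with the offset excluded). The toy data frame `dat` and the names `X` and `p` are illustrative.

```r
set.seed(1)

# Toy data, used only to build the design matrix
dat <- data.frame(Y = rpois(20, lambda = 100),
                  E = rep(100, 20),
                  x = rnorm(20))

# model.matrix() drops offset terms, leaving the intercept and x
X <- model.matrix(Y ~ offset(log(E)) + x, data = dat)
p <- ncol(X)

# Prior vectors matching the documented defaults (means 0, variances 1000)
prior.mean.beta <- rep(0, p)
prior.var.beta  <- rep(1000, p)
```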

prior.sigma2

The prior shape and scale, in the form c(shape, scale), for an Inverse-Gamma(shape, scale) prior on the variance of the random walk prior for the risk levels. Defaults to c(0.001, 0.001).

prior.alpha

The prior maximum for a Uniform(0, prior.alpha) prior for the parameter alpha. This parameter controls the temporal correlation between the allocation variables for each area to a risk class over time. Defaults to 10.

prior.delta

The prior maximum for a Uniform(0, prior.delta) prior for the parameter delta. This parameter controls the penalty constraint for the allocation variables for each area to a risk class. Defaults to 10.

verbose

Logical; should the function update the user on its progress? Defaults to TRUE.

Value

formula

The formula for the covariate and offset part of the model.

samples

A list containing the MCMC samples from the model.

fitted.values

A vector containing the fitted value for each area and time point. The vector is ordered so that all spatial units for time period one come first and then time period two and so on.

residuals

A vector containing the residuals for each area and time point. The vector is ordered so that all spatial units for time period one come first and then time period two and so on.

stepchange

A K*N matrix giving the posterior median of the allocation variable Z assigning a data point to a risk class. Each row corresponds to an areal unit and each column to a time period.

modelfit

Model fit criteria including the Deviance Information Criterion (DIC), the effective number of parameters in the model (p.d), and the log Marginal Predictive Likelihood (LMPL).

summary.results

A table summarising some of the parameters in the model.

model

A text string describing the model.

accept

The acceptance probabilities for the parameters.
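Because fitted.values and residuals are returned as vectors ordered by time period and then by spatial unit, it can be convenient to reshape them into the same K*N layout as stepchange. The sketch below uses a stand-in vector rather than real model output; the names `fitted.demo` and `fitted.mat` are illustrative.

```r
K <- 4   # number of spatial units (hypothetical)
N <- 3   # number of time periods (hypothetical)

# Stand-in for a fitted.values vector of length K * N
fitted.demo <- seq_len(K * N)

# matrix() fills column by column, so column j holds all K areas for
# time period j: rows are areal units and columns are time periods
fitted.mat <- matrix(fitted.demo, nrow = K, ncol = N)

# Value for area 2 at time period 3
fitted.mat[2, 3]
```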

Details

For further details about how to apply the function see the examples below.

References

Lee, D. and A. Lawson (2014). Cluster detection and risk estimation for spatio-temporal health data. arXiv:1408.1191.

Examples

#### Artificial data generated on a square

#### Set up a square lattice region
x.easting <- 1:10
x.northing <- 1:10
Grid <- expand.grid(x.easting, x.northing)
n <- nrow(Grid)
t <- 10


#### Set up distance and neighbourhood (W, based on sharing a common border) matrices
distance <- array(0, c(n,n))
W <- array(0, c(n,n))
for(i in 1:n)
{
    for(j in 1:n)
    {
        # Squared Euclidean distance between the centres of cells i and j
        temp <- (Grid[i,1] - Grid[j,1])^2 + (Grid[i,2] - Grid[j,2])^2
        distance[i,j] <- sqrt(temp)
        # Cells exactly one unit apart share a common border
        if(temp==1)  W[i,j] <- 1
    }
}

#### Generate data
n.all <- n * t
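# Expected counts
E <- rep(100, n.all)
# True risk surface: 70 areas with risk 1 and 30 areas with risk 2,
# repeated for each of the t time periods
log.risk <- log(rep(c(rep(1, 70), rep(2, 30)), t))
# A single covariate with a small effect on the log risk
x <- rnorm(n.all)
risk <- exp(log.risk + 0.1 * x)
# Poisson mean (named mu to avoid masking the base function mean())
mu <- E * risk
Y <- rpois(n=n.all, lambda=mu)
formula <- Y ~ offset(log(E)) + x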
     

#### Run the model     
model1 <- ST.cluster(formula, K=n, data=NULL, G=4, burnin=5000, n.sample=10000)
