SpatialEpi (version 1.1)

besag.newell: Besag-Newell Cluster Detection Method

Description

Besag-Newell cluster detection method. Our implementation differs from the original paper in the following ways:
  • we base our analysis on $k$ cases, rather than $k$ other cases as prescribed in the paper.
  • we do not subtract 1 from the accumulated numbers of other cases and the accumulated numbers of others at risk, as was prescribed in the paper to discount selection bias.
  • $M$ is the total number of areas included, not the number of additional areas included, i.e. $M$ starts at 1, not 0.
  • $p$-values are not based on the original value of $k$, but on the actual number of cases observed once we have seen $k$ or more cases. For example, if $k = 10$ but we encounter 1, 2, 9 and then 12 cases as we consider neighbors, we base our $p$-values on $k = 12$.
  • we do not provide a Monte-Carlo simulated $R$: the number of tests that attain significance at a fixed level $\alpha$.
The first two and the last differences arise because we view the testing on an area-by-area level, rather than a case-by-case level.
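
To make the area-by-area view concrete, the following is a minimal base-R sketch of the accumulation step for a single area. It is an illustration under stated assumptions, not the package's internal code: the helper name `bn_one_area` and the toy inputs are hypothetical, neighbors are assumed to be ordered by Euclidean distance between centroids, and the $p$-value is assumed to be the Poisson upper-tail probability of seeing at least the observed number of cases given the accumulated expected count.

```r
# Toy inputs (made up for illustration; not taken from pennLC)
geo      <- cbind(x = c(0, 1, 2, 5), y = c(0, 0, 0, 0))  # area centroids
cases    <- c(1, 1, 7, 3)                                # observed cases per area
expected <- c(2, 2, 2, 2)                                # expected cases per area
k        <- 10

# Hypothetical helper: Besag-Newell style test centered on area i
bn_one_area <- function(i, geo, cases, expected, k) {
  # Euclidean distance from centroid i to every centroid (area i has distance 0)
  d  <- sqrt((geo[, 1] - geo[i, 1])^2 + (geo[, 2] - geo[i, 2])^2)
  nn <- order(d)                       # area i first, then its nearest neighbors
  cum.cases <- cumsum(cases[nn])       # cases accumulated area by area
  m <- which(cum.cases >= k)[1]        # M counts area i itself, so M starts at 1
  observed.k <- cum.cases[m]           # actual count reached, may exceed k
  lambda <- sum(expected[nn[1:m]])     # expected cases in the same M areas
  # Assumed test: P(X >= observed.k) for X ~ Poisson(lambda)
  p.value <- ppois(observed.k - 1, lambda, lower.tail = FALSE)
  list(m = m, observed.k = observed.k, p.value = p.value)
}

res <- bn_one_area(1, geo, cases, expected, k)
```

With these toy values the accumulated counts around area 1 are 1, 2, 9 and 12, so `res$m` is 4 and `res$observed.k` is 12 rather than the nominal $k = 10$, which mirrors the fourth difference listed above.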

Usage

besag.newell(geo, population, cases, expected.cases=NULL, k, alpha.level)

Arguments

geo
an n x 2 table of the (x,y)-coordinates of the area centroids
cases
aggregated case counts for all n areas
population
aggregated population counts for all n areas
expected.cases
expected numbers of disease for all n areas
k
number of cases to consider
alpha.level
$\alpha$-level threshold used to declare significance

Value

List containing:
  • clusters: information on all clusters that are $\alpha$-level significant, in decreasing order of the $p$-value
  • p.values: for each of the $n$ areas, $p$-values of each cluster of size at least $k$
  • m.values: for each of the $n$ areas, the number of areas needed to observe at least $k$ cases
  • observed.k.values: based on m.values, the actual number of cases used to compute the $p$-values

Details

For the population and cases tables, the rows are grouped by area first, and then within each area the counts for each stratum are listed. It is important that the tables are balanced: the strata appear in the same order for each area, and the count for each area/stratum combination appears exactly once (even if it is zero).
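
The balance requirement can be checked before any counts are passed on. The sketch below uses a small made-up table; the data frame `toy` and its column names are hypothetical and chosen only for illustration. It tests three things: every area/stratum combination appears exactly once, the strata are listed in the same order within every area, and the rows of each area are contiguous.

```r
# Toy stratified table (hypothetical values): 2 areas x 3 strata
toy <- data.frame(
  area       = rep(c("A", "B"), each = 3),
  stratum    = rep(c("s1", "s2", "s3"), times = 2),
  population = c(100, 200, 150, 80, 0, 120),
  cases      = c(1, 3, 2, 0, 0, 1),
  stringsAsFactors = FALSE
)

# Each area/stratum combination must appear exactly once (zeros included)
is.balanced <- all(table(toy$area, toy$stratum) == 1)

# Strata must be listed in the same order within every area
strata.by.area <- split(toy$stratum, toy$area)
same.order <- all(vapply(strata.by.area, identical, logical(1),
                         strata.by.area[[1]]))

# Rows must be grouped by area: no area may reappear after another has started
grouped <- !any(duplicated(rle(toy$area)$values))
```

If any of the three flags is `FALSE`, the table should be reordered or completed with explicit zero rows first.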

References

Besag J. and Newell J. (1991) The Detection of Clusters in Rare Diseases. Journal of the Royal Statistical Society, Series A (Statistics in Society), 154, 143-155.

See Also

pennLC, expected

Examples

## Load the package and the Pennsylvania Lung Cancer Data
library(SpatialEpi)
data(pennLC)
data <- pennLC$data

## Process geographical information and convert to grid
geo <- pennLC$geo[,2:3]
geo <- latlong2grid(geo)

## Get aggregated counts of population and cases for each county
population <- tapply(data$population,data$county,sum)
cases <- tapply(data$cases,data$county,sum)

## Based on the 16 strata levels, compute expected numbers of disease
n.strata <- 16
expected.cases <- expected(data$population, data$cases, n.strata)

## Set Parameters
k <- 1250
alpha.level <- 0.05

# not controlling for strata
results <- besag.newell(geo, population, cases, expected.cases=NULL, k, alpha.level)

# controlling for strata
results <- besag.newell(geo, population, cases, expected.cases, k, alpha.level)
