optimal_ubpop computes statistics for choosing an optimal population upper bound. ubpop_seq is the sequence of candidate values to consider for the upper bound, each expressed as a proportion of the total population. The smallest value must be at least min(pop)/sum(pop), and the values should generally be less than or equal to 0.5.
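The constraints on ubpop_seq can be checked before calling the function. The following is a minimal base R sketch, assuming a small made-up population vector; the values and the name lower_limit are illustrative and not part of the package.

```r
# Hypothetical region populations (illustrative values only)
pop <- c(1200, 850, 4300, 2750, 900)

# Smallest admissible bound: the population share of the least-populous region
lower_limit <- min(pop) / sum(pop)

# Candidate upper bounds, as proportions of the total population
ubpop_seq <- seq(0.1, 0.5, by = 0.05)

# Checks mirroring the documented requirements
stopifnot(
  min(ubpop_seq) >= lower_limit,            # at least min(pop)/sum(pop)
  max(ubpop_seq) <= 0.5,                    # generally no more than 0.5
  !is.unsorted(ubpop_seq, strictly = TRUE)  # strictly increasing
)
```

If any check fails, stopifnot raises an error naming the violated condition, which is easier to diagnose than a failure inside a long simulation run.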
optimal_ubpop(
  coords,
  cases,
  pop,
  ex = sum(cases)/sum(pop) * pop,
  nsim = 499,
  alpha = 0.05,
  ubpop_seq = seq(0.01, 0.5, len = 50),
  longlat = FALSE,
  cl = NULL,
  type = "poisson",
  min.cases = 0,
  simdist = "multinomial"
)
Returns a smerc_optimal_ubpop object. This includes:
The sequence of population bounds considered.
An object with statistics related to the elbow method.
An object with statistics related to the Gini method.
The population upper bound suggested by the elbow method.
The population upper bound suggested by the Gini method.
coords: An \(n \times 2\) matrix of centroid coordinates for the regions, in the form (x, y), or (longitude, latitude) if using great circle distance.
cases: The number of cases observed in each region.
pop: The population size associated with each region.
ex: The expected number of cases for each region. The default is calculated under the constant risk hypothesis.
nsim: The number of simulations from which to compute the p-value.
alpha: The significance level used to determine whether a cluster is significant. The default is 0.05.
ubpop_seq: A strictly increasing numeric vector with values between min(pop)/sum(pop) and 1. The default is seq(0.01, 0.5, len = 50).
longlat: The default is FALSE, which specifies that Euclidean distance should be used. If longlat is TRUE, then the great circle distance is used to calculate the intercentroid distance.
cl: A cluster object created by makeCluster, or an integer indicating the number of child processes (integer values are ignored on Windows) for parallel evaluation (see Details on performance). It can also be "future" to use a future backend (see Details). NULL (the default) means sequential evaluation.
type: The type of scan statistic to compute. The default is "poisson". The other choice is "binomial".
min.cases: The minimum number of cases required for a cluster. The default is 0.
simdist: Character string indicating the simulation distribution. The default is "multinomial", which conditions on the total number of cases observed. The other options are "poisson" and "binomial".
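The default for ex in the usage above, sum(cases)/sum(pop) * pop, is the expected count under the constant risk hypothesis: every region shares the same overall risk, so its expected cases scale with its population. A minimal base R sketch, assuming made-up counts for five regions (the data and the name overall_risk are illustrative only):

```r
# Hypothetical case counts and populations for five regions
cases <- c(3, 1, 12, 6, 2)
pop <- c(1200, 850, 4300, 2750, 900)

# Overall risk: total cases divided by total population
overall_risk <- sum(cases) / sum(pop)

# Expected cases per region under constant risk (same formula as the ex default)
ex <- overall_risk * pop

# The expected counts sum to the observed total number of cases
stopifnot(isTRUE(all.equal(sum(ex), sum(cases))))
```

Supplying a different ex is only needed when the expected counts should reflect something other than constant risk, for example covariate-adjusted rates.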
Joshua French
Meysami, Mohammad, French, Joshua P., and Lipner, Ettie M. The estimation of the optimal cluster upper bound for scan methods in retrospective disease surveillance. Submitted.
Han, J., Zhu, L., Kulldorff, M. et al. Using Gini coefficient to determining optimal cluster reporting sizes for spatial scan statistics. Int J Health Geogr 15, 27 (2016). <doi:10.1186/s12942-016-0056-6>
scan.test
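# Load the nydf example data
data(nydf)
# Centroid coordinates as an n x 2 matrix (longitude, latitude)
coords <- with(nydf, cbind(longitude, latitude))
# A small nsim and a coarse ubpop_seq keep the example fast
ubpop_stats <- optimal_ubpop(
  coords = coords, cases = nydf$cases,
  pop = nydf$pop, nsim = 49,
  ubpop_seq = seq(0.05, 0.5, by = 0.05)
)
ubpop_stats
# Plotting is not run by default
if (FALSE) {
  plot(ubpop_stats)
}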