Description

Calculates a p-value surface, based on asymptotic theory or Monte-Carlo (MC) permutations, describing the extremity of risk given a fixed- or adaptive-bandwidth kernel-smoothed density-ratio, allowing the drawing of tolerance contours.
Usage

tolerance(rs, method = c("ASY", "MC"), ref.density = NULL, beta = 0.025,
          ITER = 100, parallelise = NULL, verbose = TRUE, ...)
Arguments

rs: An object of class rrs giving the estimated relative risk function for which to calculate the p-value surface.

method: A character string specifying the method of calculation. "ASY" (default) instructs the function to compute the p-values using asymptotic theory; "MC" computes the values by random permutations of the data. See 'Details'.
ref.density: Required if rs is based on fixed-bandwidth estimates of the case and control densities and method = "ASY". Either a pixel image (of spatstat class im) or an object of class bivden giving the reference density to use in the asymptotic formulae. May be unnormalised. Ignored if rs is based on adaptive kernel estimates or if method = "MC".
beta: A numeric value, 0 < beta < 1, giving the fineness of the adaptive bandwidth partitioning to use for calculation of the quantities required for asymptotic adaptive p-value surfaces. Smaller values provide more accurate bandwidth bins at the cost of additional computing time; see Davies and Baddeley (2017). The default is sensible in most cases. Ignored if rs is based on fixed-bandwidth kernel estimates.
ITER: Number of iterations for the Monte-Carlo permutations. Ignored if method = "ASY".
parallelise: Numeric argument to invoke parallel processing, giving the number of CPU cores to use when method = "MC". Experimental. Test your system first using parallel::detectCores() to identify the number of cores available to you; see the sketch following this argument list.
verbose: Logical value indicating whether to print function progress during execution.

...: Additional arguments passed to internal calls of risk when method = "MC"; see 'Details'.
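A minimal sketch of the core check suggested under parallelise; the object riskfix and the choice to reserve one core are illustrative, not package requirements:

# Count the cores visible to R and, by common convention, leave one
# free for the operating system before requesting parallel MC work.
ncores <- parallel::detectCores()
if (!is.na(ncores) && ncores > 1) {
  mctol <- tolerance(riskfix, method = "MC", ITER = 200,
                     parallelise = ncores - 1)
}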
Value

A pixel image of the estimated p-value surface.
Details

This function implements developments in Hazelton and Davies (2009) (fixed-bandwidth case) and Davies and Hazelton (2010) (adaptive-bandwidth case) to compute pointwise p-value surfaces based on asymptotic theory of kernel-smoothed relative risk surfaces. Alternatively, the user may elect to calculate the p-value surfaces using Monte-Carlo methods (see Kelsall and Diggle, 1995). Superimposing contours of these p-values at given significance levels (i.e. "tolerance contours") upon a plot of the risk surface can be an informative way of exploring the statistical significance of the extremity of risk across the defined study region.
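A minimal sketch of that workflow, where the object myrisk (an rrs estimate) is illustrative:

ps <- tolerance(myrisk)                      # method = "ASY" by default
plot(myrisk)                                 # plot the relative risk surface
tol.contour(ps, levels = 0.05, add = TRUE)   # overlay 5% tolerance contours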
The Monte-Carlo method, while not dependent on asymptotic theory, is computationally expensive, and it has been suggested that it might have some undesirable practical consequences in certain settings (Hazelton and Davies, 2009). When performing the MC simulations, the same global (and pilot, if necessary) bandwidths and edge-correction regimen are employed as were used in the initial density estimates of the observed data. With regard to arguments passed to internal calls of risk, the user should take care to use ... to set the epsilon value to match that used in creation of the object passed to rs (if this was set to a non-default value). Furthermore, if performing MC simulations for the adaptive relative risk function, the function borrows the value of the beta argument to speed things up via partitioning, and the user should additionally use ... to set the same pilot.symmetry value as was used for creation of the object passed to rs, in the same way as for any non-default use of epsilon. This ensures the simulations are all performed under the same conditions as the original risk function.
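For instance, a minimal sketch of matching the simulation conditions, assuming the original adaptive estimate was produced with pilot.symmetry = "pooled" (as in the examples below); the same value is forwarded through ... so each permuted re-estimate is built identically:

riskada <- risk(pbc, h0 = h0, adapt = TRUE, hp = NS(pbc),
                pilot.symmetry = "pooled")
adatol  <- tolerance(riskada, method = "MC", ITER = 200,
                     pilot.symmetry = "pooled")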
The asymptotic approach to the p-value calculation can be advantageous over a Monte-Carlo method, which can lead to excessive computation time for adaptive risk surfaces and large datasets. See the aforementioned references for further comments.
Choosing different options for the argument test simply manipulates the 'direction' of the p-values. That is, plotting tolerance contours at a significance level of 0.05 for a p-value surface calculated with test = "double" is equivalent to plotting tolerance contours at significance levels of 0.025 and 0.975 for test = "upper".
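As a hypothetical illustration, where ps_double and ps_upper denote p-value surfaces computed under the two settings, the following two calls would draw identical curves:

tol.contour(ps_double, levels = 0.05, add = TRUE)
tol.contour(ps_upper,  levels = c(0.025, 0.975), add = TRUE)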
Implementation of the Monte-Carlo contours for the fixed-bandwidth estimator simply involves random reallocation of the case/control marks and re-estimation of the risk surface ITER times, against which the original estimate is compared. The bandwidth(s) for the case and control densities in the permuted estimates are controlled by hpsim. If your risk surface is adaptive, hpsim controls the pilot bandwidths and h0sim the global bandwidths. In particular, for the adaptive symmetric estimator (Davies et al., 2016), it is assumed that the original estimate was itself calculated as a symmetric estimate via use of the pdef argument. The symchoice argument governs the specific permuted data set to use for the pilot estimate, and if hpsim is NULL, the pilot bandwidth thereof is taken from the stored pdef object in the original estimate. An error will occur if you attempt to set symchoice with an rs argument that does not contain density estimates with pdef objects of class "bivden" present. See the help file for bivariate.density for details on using the pdef argument.
In addition to the usage noted above, you may define either hpsim and/or h0sim as functions which recalculate the case and control pilot (or fixed) bandwidth(s) and the global bandwidth(s) at each iteration, based on the data set of permuted case/control marks. If so, these must strictly be functions that take the case data as the first argument and the control data as the second argument, each as a two-column matrix of x-y coordinates. The function must strictly return a numeric vector of length 1 or 2, whose entries are assigned to the relevant density estimates as per the usage notes on supplying a numeric vector for hpsim. Take care: warnings will be issued if, for example, you specify an hpsim function that returns two numeric values but your rs object is a symmetric-adaptive estimate (in which case it only makes sense to yield one pilot bandwidth). A sketch of such a function follows.
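A minimal sketch of such a bandwidth-recalculation function, assuming a build of tolerance that accepts hpsim (it does not appear in the usage line above) and that the study window W is available in the calling environment; OS() is sparr's oversmoothing bandwidth selector and ppp() comes from spatstat:

library(spatstat)

# Hypothetical recalculation: one oversmoothing bandwidth for each of the
# permuted case and control patterns, returned as a length-2 numeric
# vector (case bandwidth first, then control bandwidth).
hp_recalc <- function(cases, controls) {
  hc <- OS(ppp(x = cases[, 1], y = cases[, 2], window = W))
  hk <- OS(ppp(x = controls[, 1], y = controls[, 2], window = W))
  c(hc, hk)
}

Under the same assumption, this could then be supplied as, e.g., tolerance(riskfix, method = "MC", hpsim = hp_recalc).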
References

Davies, T.M. and Baddeley, A. (2017), Fast computation of spatially adaptive kernel estimates, Statistics and Computing [to appear].

Davies, T.M. and Hazelton, M.L. (2010), Adaptive kernel estimation of spatial relative risk, Statistics in Medicine, 29(23), 2423-2437.

Hazelton, M.L. and Davies, T.M. (2009), Inference based on kernel estimates of the relative risk function in geographical epidemiology, Biometrical Journal, 51(1), 98-109.

Kelsall, J.E. and Diggle, P.J. (1995), Kernel estimation of relative risk, Bernoulli, 1, 3-16.
Examples

## Not run:
data(pbc)

# Select a common global bandwidth via cross-validation
h0 <- LSCV.risk(pbc, method = "hazelton"); h0

# Split the marked point pattern into case and control patterns
pbccas <- split(pbc)[[1]]
pbccon <- split(pbc)[[2]]

# ASY: asymptotic p-value surfaces for fixed and adaptive estimates
riskfix <- risk(pbc, h0 = h0)
fixtol1 <- tolerance(riskfix, ref.density = density(pbc, OS(pbc)))
riskada <- risk(pbc, h0 = h0, adapt = TRUE, hp = NS(pbc),
                pilot.symmetry = "pooled", davies.baddeley = 0.025)
adatol1 <- tolerance(riskada)

par(mfrow = c(1, 2))
plot(riskfix)
tol.contour(fixtol1, levels = c(0.1, 0.05, 0.01), lty = 3:1, add = TRUE)
plot(riskada)
tol.contour(adatol1, levels = c(0.1, 0.05, 0.01), lty = 3:1, add = TRUE)

# MC: Monte-Carlo p-value surfaces via random permutation
fixtol2 <- tolerance(riskfix, method = "MC", ITER = 200)
adatol2 <- tolerance(riskada, method = "MC", ITER = 200,
                     parallelise = 4) # ~1 minute with parallelisation

par(mfrow = c(1, 2))
plot(riskfix)
tol.contour(fixtol2, levels = c(0.1, 0.05, 0.01), lty = 3:1, add = TRUE)
plot(riskada)
tol.contour(adatol2, levels = c(0.1, 0.05, 0.01), lty = 3:1, add = TRUE)
## End(Not run)