LSCV.risk: Leave-one-out least-squares cross-validation (LSCV) bandwidths for the relative risk function

Description

Attempts to estimate a jointly optimal, common case-control fixed bandwidth for use in the kernel-smoothed relative risk function via leave-one-out least-squares cross-validation (LSCV). The user can choose between two selection criteria, described in Kelsall and Diggle (1995a, b) and Hazelton (2008) respectively.

Usage

LSCV.risk(cases, controls, hlim = NULL,
	 method = c("kelsall-diggle", "hazelton"), res = 128, 
	 WIN = NULL, edge = TRUE, comment = TRUE)

Arguments

cases

An object of type data.frame, list, matrix, or ppp descr

controls

As for cases, but for the control observations. Both cases and controls must be of the same object class.

hlim

A numeric vector of length 2 giving the interval over which to search for the common bandwidth which minimises the selection criterion. If NULL (default), the function attempts to automatically select an appropriate range based on multiples o

method

A character string giving the selection criterion to minimise: either "kelsall-diggle" (Kelsall and Diggle, 1995b) or "hazelton" (Hazelton, 2008). See 'Details'. Defaults to "kelsall-diggle".

res

Single integer giving the square grid resolution over which evaluation of the selection criterion takes place. Defaults to a 128 by 128 grid.

WIN

A polygonal owin object giving the study region. Ignored if cases and controls are already objects of class ppp (see ppp.object).

edge

Boolean. Whether or not to employ edge-correction in the calculations. Defaults to TRUE.

comment

Boolean. Whether or not to print function progress during execution. Defaults to TRUE.

Value

A single numeric value of the estimated bandwidth. The user may need to experiment with adjusting hlim to find a suitable minimum.
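To illustrate why the search interval matters, the following base-R sketch applies leave-one-out LSCV to a one-dimensional Gaussian kernel density estimate. This is a deliberately simpler criterion than either of the two used by LSCV.risk, and the names lscv_1d, h_hat, and at_edge are hypothetical, introduced only for this illustration. A minimiser that sits at one end of the interval suggests the interval should be widened or shifted.

```r
# Hypothetical helper (not part of any package): leave-one-out LSCV
# criterion for a 1D Gaussian kernel density estimate with bandwidth h.
lscv_1d <- function(h, x) {
  n <- length(x)
  d <- outer(x, x, "-")  # all pairwise differences x_i - x_j
  # Integral of the squared density estimate (closed form for Gaussian kernels)
  int_f2 <- sum(dnorm(d, sd = sqrt(2) * h)) / n^2
  # Average of the leave-one-out density estimates evaluated at each x_i
  loo <- (sum(dnorm(d, sd = h)) - n * dnorm(0, sd = h)) / (n * (n - 1))
  int_f2 - 2 * loo
}

set.seed(42)
x <- rnorm(200)          # simulated data, for illustration only
hlim <- c(0.05, 2)       # search interval, playing the role of hlim

opt <- optimise(lscv_1d, interval = hlim, x = x)
h_hat <- opt$minimum

# Flag a minimiser that lies within 1% of either end of the interval
at_edge <- min(abs(h_hat - hlim)) < 0.01 * diff(hlim)
print(c(h_hat = h_hat, at_edge = at_edge))
```

If at_edge is TRUE, re-running the search with a different interval is a reasonable next step; the same trial-and-error applies when adjusting hlim.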

Warning

Leave-one-out LSCV for jointly optimal, common bandwidth selection in the kernel-smoothed risk function is even more unstable (in terms of high variability) than the standalone density version. Caution is advised; not all applications will yield a successful result (this is termed "a breakdown of the methodology" by Kelsall and Diggle, 1995a). Undersmoothing has been noted in this author's personal experience. This method can also be computationally expensive for large data sets and fine evaluation grid resolutions.

Details

This function calculates a 'jointly optimal', common isotropic LSCV bandwidth for the (Gaussian) kernel-smoothed relative risk function (case-control density-ratio).

If the cases and controls arguments are data.frame or matrix objects, each must have exactly two columns containing the x ([,1]) and y ([,2]) data values. If they are lists, each must have two vector components of equal length named x and y. Alternatively, cases and controls may be objects of class ppp (see ppp.object), in which case the argument WIN can be ignored.

It can be shown that choosing a bandwidth that is equal for both case and control density estimates is preferable to computing 'separately optimal' bandwidths (Kelsall and Diggle, 1995a).

With method = "kelsall-diggle", LSCV.risk computes the common bandwidth which minimises the approximate mean integrated squared error of the log-transformed risk surface (see specifically Kelsall and Diggle, 1995b). Alternatively, with method = "hazelton", it computes the common case-control bandwidth which minimises a weighted mean integrated squared error of the (raw) relative risk function (see Hazelton, 2008). Generally, this author has found the Kelsall-Diggle method to provide more stable performance.
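The non-ppp input layouts described above can be sketched in base R as follows. The coordinates are simulated, and is_valid_xy is a hypothetical helper written only for this illustration (it is not part of the package); it checks a data.frame, matrix, or list against the stated layout.

```r
# Hypothetical helper (not part of the package): TRUE if obj matches the
# two-column or x/y-list layout described in 'Details'.
is_valid_xy <- function(obj) {
  if (is.data.frame(obj) || is.matrix(obj)) {
    return(ncol(obj) == 2)               # column 1 = x, column 2 = y
  }
  if (is.list(obj)) {
    return(all(c("x", "y") %in% names(obj)) &&
             length(obj$x) == length(obj$y))
  }
  FALSE
}

set.seed(1)
xy <- cbind(runif(30), runif(30))        # simulated coordinates

cases_matrix <- xy                                    # matrix form
cases_df     <- data.frame(x = xy[, 1], y = xy[, 2])  # data.frame form
cases_list   <- list(x = xy[, 1], y = xy[, 2])        # list form

bad_list <- list(x = 1:3, y = 1:2)       # unequal lengths: not acceptable

print(c(matrix     = is_valid_xy(cases_matrix),
        data.frame = is_valid_xy(cases_df),
        list       = is_valid_xy(cases_list),
        bad_list   = is_valid_xy(bad_list)))
```

Whichever form is chosen, cases and controls must use the same one, and for these non-ppp forms the study region is supplied separately through WIN.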

References

Kelsall, J.E. and Diggle, P.J. (1995a), Kernel estimation of relative risk, Bernoulli, 1, 3-16.

Kelsall, J.E. and Diggle, P.J. (1995b), Non-parametric estimation of spatial variation in relative risk, Statistics in Medicine, 14, 2335-2342.

Hazelton, M.L. (2008), Letter to the editor: Kernel estimation of risk surfaces without the need for edge correction, Statistics in Medicine, 27, 2269-2272.

Stoyan, D. and Stoyan, H. (1994), Fractals, Random Shapes and Point Fields. Wiley, Great Britain. ISBN 0-471-93757-6.

Examples

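library(sparr)  # assumed to provide LSCV.risk and access to the chorley data

data(chorley)

# split() separates the marked pattern into its two mark levels
LSCV.risk(cases = split(chorley)[[1]], controls = split(chorley)[[2]],
          hlim = c(0.1, 2))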
