relrisk.ppp: Nonparametric Estimate of Spatially-Varying Relative Risk

Description

Given a multitype point pattern, this function estimates the spatially-varying probability of each type of point, or the ratios of such probabilities, using kernel smoothing. The default smoothing bandwidth is selected by cross-validation.

Usage

# S3 method for ppp
relrisk(X, sigma = NULL, ...,
           at = c("pixels", "points"),
           weights = NULL, varcov = NULL, 
           relative=FALSE, normalise=FALSE,
           adjust=1, edge=TRUE, diggle=FALSE,
           se=FALSE, wtype=c("value", "multiplicity"),
           casecontrol=TRUE, control=1, case, shrink=0, fudge=0)

Arguments

Value

If se=FALSE (the default), the format is described below. If se=TRUE, the result is a list of two entries,

estimate and SE, each having the format described below.

If X consists of only two types of points, and if casecontrol=TRUE, the result is a pixel image (if at="pixels") or a vector (if at="points"). The pixel values or vector values are the probabilities of a case if relative=FALSE, or the relative risk of a case (probability of a case divided by the probability of a control) if relative=TRUE.

If X consists of more than two types of points, or if casecontrol=FALSE, the result is:

(if at="pixels") a list of pixel images, with one image for each possible type of point. The result also belongs to the class "solist" so that it can be printed and plotted.
(if at="points") a matrix of probabilities, with rows corresponding to data points \(x_i\), and columns corresponding to types \(j\).

The pixel values or matrix entries are the probabilities of each type of point if relative=FALSE, or the relative risk of each type (probability of each type divided by the probability of a control) if relative=TRUE.

If relative=FALSE, the resulting values always lie between 0 and 1. If relative=TRUE, the results are either non-negative numbers, or the values Inf or NA.

Details

The command relrisk is generic and can be used to estimate relative risk in different ways.

This function relrisk.ppp is the method for point pattern datasets. It computes nonparametric estimates of relative risk by kernel smoothing (Bithell, 1990, 1991; Diggle, 2003; Baddeley, Rubak and Turner, 2015).

If X is a bivariate point pattern (a multitype point pattern consisting of two types of points) then by default, the points of the first type (the first level of marks(X)) are treated as controls or non-events, and points of the second type are treated as cases or events. Then by default this command computes the spatially-varying probability of a case, i.e. the probability \(p(u)\) that a point at spatial location \(u\) will be a case. If relative=TRUE, it computes the spatially-varying relative risk of a case relative to a control, \(r(u) = p(u)/(1- p(u))\).

If X is a multitype point pattern with \(m > 2\) types, or if X is a bivariate point pattern and casecontrol=FALSE, then by default this command computes, for each type \(j\), a nonparametric estimate of the spatially-varying probability of an event of type \(j\). This is the probability \(p_j(u)\) that a point at spatial location \(u\) will belong to type \(j\). If relative=TRUE, the command computes the relative risk of an event of type \(j\) relative to a control, \(r_j(u) = p_j(u)/p_k(u)\), where events of type \(k\) are treated as controls. The argument control determines which type \(k\) is treated as a control.

If at = "pixels" the calculation is performed for every spatial location \(u\) on a fine pixel grid, and the result is a pixel image representing the function \(p(u)\) or a list of pixel images representing the functions \(p_j(u)\) or \(r_j(u)\) for \(j = 1,\ldots,m\). An infinite value of relative risk (arising because the probability of a control is zero) will be returned as NA.

If at = "points" the calculation is performed only at the data points \(x_i\). By default the result is a vector of values \(p(x_i)\) giving the estimated probability of a case at each data point, or a matrix of values \(p_j(x_i)\) giving the estimated probability of each possible type \(j\) at each data point. If relative=TRUE then the relative risks \(r(x_i)\) or \(r_j(x_i)\) are returned. An infinite value of relative risk (arising because the probability of a control is zero) will be returned as Inf.

Estimation is performed by a simple Nadaraja-Watson type kernel smoother (Bithell, 1990, 1991; Diggle, 2003; Baddeley, Rubak and Turner, 2015, section 14.4). The smoothing bandwidth can be specified in any of the following ways:

sigma is a single numeric value, giving the standard deviation of the isotropic Gaussian kernel.
sigma is a numeric vector of length 2, giving the standard deviations in the \(x\) and \(y\) directions of a Gaussian kernel.
varcov is a 2 by 2 matrix giving the variance-covariance matrix of the Gaussian kernel.
sigma is a function which selects the bandwidth. Bandwidth selection will be applied separately to each type of point. An example of such a function is bw.diggle.
sigma and varcov are both missing or null. Then a common smoothing bandwidth sigma will be selected by cross-validation using bw.relrisk.
An infinite smoothing bandwidth, sigma=Inf, is permitted and yields a constant estimate of relative risk.

If se=TRUE then standard errors will also be computed, based on asymptotic theory, assuming a Poisson process.

The optional argument weights may provide numerical weights for the points of X. It should be a numeric vector of length equal to npoints(X).

The argument weights can also be an expression. It will be evaluated in the data frame as.data.frame(X) to obtain a vector of weights. The expression may involve the symbols x and y representing the Cartesian coordinates, and the symbol marks representing the mark values.

The argument weights can also be a pixel image (object of class "im"). numerical weights for the data points will be extracted from this image (by looking up the pixel values at the locations of the data points in X).

References

Baddeley, A., Rubak, E. and Turner, R. (2015) Spatial Point Patterns: Methodology and Applications with R. Chapman and Hall/CRC Press.

Bithell, J.F. (1990) An application of density estimation to geographical epidemiology. Statistics in Medicine 9, 691--701.

Bithell, J.F. (1991) Estimation of relative risk functions. Statistics in Medicine 10, 1745--1751.

Diggle, P.J. (2003) Statistical analysis of spatial point patterns, Second edition. Arnold.

Diggle, P.J., Zheng, P. and Durr, P. (2005) Non-parametric estimation of spatial segregation in a multivariate point process: bovine tuberculosis in Cornwall, UK. Applied Statistics 54, 645--658.

Examples

Run this code

   p.oak <- relrisk(urkiola, 20)
   if(interactive()) {
      plot(p.oak, main="proportion of oak")
      plot(eval.im(p.oak > 0.3), main="More than 30 percent oak")
      plot(split(lansing), main="Lansing Woods")
      p.lan <- relrisk(lansing, 0.05, se=TRUE)
      plot(p.lan$estimate, main="Lansing Woods species probability")
      plot(p.lan$SE, main="Lansing Woods standard error")
      wh <- im.apply(p.lan$estimate, which.max)
      types <- levels(marks(lansing))
      wh <- eval.im(types[wh])
      plot(wh, main="Most common species")
   }

Run the code above in your browser using DataLab