roc: Receiver Operating Characteristic

Description

Computes the Receiver Operating Characteristic curve for a point pattern or a fitted point process model.

Usage

roc(X, ...)
# S3 method for ppp
roc(X, covariate, 
                  ...,
                  baseline = NULL, high = TRUE, weights = NULL,
                  observations=c("exact", "presence"),
                  method = "raw",
                  CI = "none", alpha=0.05,
                  subset=NULL)
# S3 method for cdftest
roc(X, ..., high=TRUE)
# S3 method for bermantest
roc(X, ..., high=TRUE)
# S3 method for im
roc(X, covariate, ..., high=TRUE)

Value

Function value table (object of class "fv") which can be plotted to show the ROC curve. Also belongs to class "roc".

Arguments

X: Point pattern (object of class "ppp" or "lpp") or fitted point process model (object of class "ppm" or "kppm" or "lppm") or fitted spatial logistic regression model (object of class "slrm") or some other kind of data.
covariate: Spatial covariate. Either a function(x,y), a pixel image (object of class "im"), or one of the strings "x" or "y" indicating the Cartesian coordinates. Traditionally omitted when X is a fitted model.
...: Arguments passed to as.mask controlling the pixel resolution for calculations.
baseline: Optional. A spatial object giving a baseline intensity. Usually a function(x,y) or a pixel image (object of class "im") giving the baseline intensity at any location within the observation window. Alternatively a point pattern (object of class "ppp") with the locations of the reference population.
high: Logical value indicating whether the threshold operation should favour high or low values of the covariate.
weights: Optional. Numeric vector of weights attached to the data points.
observations: Character string (partially matched) specifying whether to compute the ROC curve using the exact point coordinates (observations="exact", the default) or using the discretised presence-absence data (observations="presence").
method: The method or methods that should be used to estimate the ROC curve. A character vector: current choices are "raw", "monotonic", "smooth" and "all". See Details.
CI: Character string (partially matched) specifying whether confidence intervals should be computed, and for which method. See Details.
alpha: Numeric value between 0 and 1. The confidence intervals will have confidence level 1-alpha. The default gives 95% confidence intervals.
subset: Optional. A spatial window (object of class "owin") specifying a subset of the data, from which the ROC should be calculated.

Author

Adrian Baddeley Adrian.Baddeley@curtin.edu.au, Ege Rubak rubak@math.aau.dk and Suman Rakshit Suman.Rakshit@curtin.edu.au.

Details

This command computes the Receiver Operating Characteristic (ROC) curve. The area under the ROC is computed by auc.

The function roc is generic, with methods for point patterns, fitted point process models, and other kinds of data.

For a point pattern X and a spatial covariate Z, the ROC is a plot showing the ability of the covariate to separate the spatial domain into areas of high and low density of points. For each possible threshold \(z\), the algorithm calculates the fraction \(a(z)\) of area in the study region where the covariate takes a value greater than \(z\), and the fraction \(b(z)\) of data points for which the covariate value is greater than \(z\). The ROC is a plot of \(b(z)\) against \(a(z)\) for all thresholds \(z\). This is called the ‘raw’ ROC curve. This description applies when high=TRUE; when high=FALSE the thresholding instead favours low values of the covariate.
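The calculation of \(a(z)\) and \(b(z)\) can be sketched in base R without spatstat. This is only an illustration under stated assumptions, not the package's implementation: the covariate values on the pixel grid (Zpix) and at the data points (Zpts) are simulated, the helper names raw_roc and trapezoid_auc are hypothetical, high=TRUE is assumed, and the trapezoidal rule is just one simple way to approximate the area under the curve.

```r
## Simulated inputs (assumptions, not real data):
## Zpix = covariate values on equal-area pixels covering the study region
## Zpts = covariate values at the data points; pixels are sampled with
##        probability proportional to Z, so high Z attracts more points
set.seed(42)
Zpix <- runif(2500)
Zpts <- sample(Zpix, 200, replace = TRUE, prob = Zpix)

## Hypothetical helper: raw ROC for high = TRUE
raw_roc <- function(Zpix, Zpts) {
  ## thresholds from highest to lowest observed covariate value
  z <- sort(unique(c(Zpix, Zpts)), decreasing = TRUE)
  ## a(z): fraction of area with covariate > z
  ## b(z): fraction of data points with covariate > z
  ## the final entry (1, 1) corresponds to a threshold below every value
  a <- c(1 - ecdf(Zpix)(z), 1)
  b <- c(1 - ecdf(Zpts)(z), 1)
  data.frame(z = c(z, -Inf), a = a, b = b)
}

## Hypothetical helper: area under the curve by the trapezoidal rule
trapezoid_auc <- function(rc) {
  with(rc, sum(diff(a) * (head(b, -1) + tail(b, -1)) / 2))
}

rc <- raw_roc(Zpix, Zpts)
area <- trapezoid_auc(rc)

if (interactive()) {
  plot(rc$a, rc$b, type = "s", xlab = "a(z)", ylab = "b(z)")
  abline(0, 1, lty = 2)  # diagonal: covariate with no discriminating power
}
```

Because the simulated points favour high covariate values, the curve lies mostly above the diagonal and the area exceeds 0.5.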

There are currently three methods to estimate the ROC curve, selected by the argument method, together with an option to compute all of them:

"raw": uses the raw empirical spatial cumulative distribution function of the covariate.
"monotonic": uses a monotonic regression to estimate the relation between the covariate and the point process intensity and then calculates the ROC from that. This corresponds to either a convex minorant or a concave majorant of the raw ROC curve.
"smooth": uses a smooth estimate of the relation between the covariate and the point process intensity and then calculates the ROC from that. See roc.rhohat for details.
"all": uses all of the above methods.

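The concave majorant mentioned under "monotonic" can be illustrated geometrically in base R. This is a sketch only: roc obtains the monotonic estimate from a monotonic regression, whereas the hypothetical helper concave_majorant below simply builds the upper convex hull of a few made-up raw ROC points to show what the resulting curve looks like.

```r
## Made-up raw ROC points (assumption, for illustration only)
raw <- data.frame(a = c(0, 0.2, 0.4, 0.6, 0.8, 1),
                  b = c(0, 0.5, 0.4, 0.9, 0.85, 1))

## Hypothetical helper: vertices of the least concave majorant,
## found by a left-to-right scan that discards any point lying
## on or below the chord joining its neighbours
concave_majorant <- function(a, b) {
  o <- order(a, b)
  a <- a[o]
  b <- b[o]
  keep <- integer(0)
  for (i in seq_along(a)) {
    while (length(keep) >= 2) {
      j <- keep[length(keep)]
      k <- keep[length(keep) - 1]
      ## cross >= 0 means point j is on or below the chord from k to i
      cross <- (a[j] - a[k]) * (b[i] - b[k]) - (b[j] - b[k]) * (a[i] - a[k])
      if (cross >= 0) keep <- keep[-length(keep)] else break
    }
    keep <- c(keep, i)
  }
  data.frame(a = a[keep], b = b[keep])
}

hull <- concave_majorant(raw$a, raw$b)

if (interactive()) {
  plot(raw$a, raw$b, type = "b", xlab = "a", ylab = "b")
  lines(hull$a, hull$b, lty = 2)  # concave majorant
}
```

Joining the retained vertices by straight lines gives a curve with decreasing slopes that lies on or above every raw point. The convex minorant is obtained in the same way with the inequality reversed.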
If CI is one of the strings "raw", "monotonic" or "smooth", then pointwise confidence intervals for the true ROC curve will be computed based on the raw, monotonic or smooth estimate, respectively. The confidence level is 1-alpha, so the default alpha=0.05 gives 95% confidence intervals and alpha=0.01 would give 99% confidence intervals. By default (CI="none"), confidence intervals are not computed.

Some other kinds of objects in spatstat contain sufficient data to compute the ROC curve. These include the objects returned by rhohat, cdf.test and berman.test. Methods are provided here to compute the ROC curve from these objects.

The method for pixel images (objects of class "im") assumes that X represents a density or intensity function, and that the objective is to segregate the spatial region into subregions of high and low total density by thresholding the covariate.

References

Baddeley, A., Rubak, E., Rakshit, S. and Nair, G. (2025) ROC curves for spatial point patterns and presence-absence data. doi:10.48550/arXiv.2506.03414.

Lobo, J.M., Jimenez-Valverde, A. and Real, R. (2007) AUC: a misleading measure of the performance of predictive distribution models. Global Ecology and Biogeography 17(2) 145--151.

Nam, B.-H. and D'Agostino, R. (2002) Discrimination index, the area under the ROC curve. Pages 267--279 in Huber-Carol, C., Balakrishnan, N., Nikulin, M.S. and Mesbah, M., Goodness-of-fit tests and model validity, Birkhauser, Basel.

Examples

  gold <- rescale(murchison$gold, 1000, "km")
  faults <- rescale(murchison$faults, 1000, "km")
  dfault <- distfun(faults)

  if(interactive()) {
    plot(roc(gold, dfault, method = "all", high=FALSE))
  } else {
    ## reduce sample resolution to save computation time in test
    plot(roc(gold, dfault, method = "all", high=FALSE, eps=8))
  }

  # Using either an image or reference population as baseline
  cases <- split(chorley)$larynx
  controls <- split(chorley)$lung
  covar <- distfun(as.ppp(chorley.extra$incin, W = Window(chorley)))
  if(interactive()) {
    population <- density(controls, sigma=0.15, eps=0.1)
  } else {
    ## reduce resolution to save computation time in test
    population <- density(controls, sigma=0.3, eps=0.25)
  }
  population <- eval.im(pmax(population, 1e-10))
  roc1 <- roc(cases, covar, baseline = population, high = FALSE, method="all")
  roc2 <- roc(cases, covar, baseline = controls, high = FALSE, method="all")
  plot(anylist(roc1=roc1, roc2=roc2), main = "")
