kstest.ppm: Kolmogorov-Smirnov Test for Point Pattern or Point Process Model

Description

Performs a Kolmogorov-Smirnov test of goodness-of-fit of a Poisson point process model. The test compares the observed and predicted distributions of the values of a spatial covariate.

Usage

kstest(...)
## S3 method for class 'ppp':
kstest(X, covariate, ..., jitter=TRUE)
## S3 method for class 'ppm':
kstest(model, covariate, ..., jitter=TRUE)
## S3 method for class 'lpp':
kstest(X, covariate, ..., jitter=TRUE)
## S3 method for class 'lppm':
kstest(model, covariate, ..., jitter=TRUE)
## S3 method for class 'slrm':
kstest(model, covariate, ..., modelname=NULL, covname=NULL)

Arguments

X

A point pattern (object of class "ppp" or "lpp").

model

A fitted point process model (object of class "ppm" or "lppm") or fitted spatial logistic regression (object of class "slrm").

covariate

The spatial covariate on which the test will be based. A function, a pixel image (object of class "im"), a list of pixel images, or one of the characters "x" or "y" indicating the Cartesian coordinates.

...

Arguments passed to ks.test to control the test.

jitter

Logical flag. If jitter=TRUE, values of the covariate will be slightly perturbed at random, to avoid tied values in the test.

modelname,covname

Character strings giving alternative names for model and covariate to be used in labelling plot axes.

Value

An object of class "htest" containing the results of the test. See ks.test for details. The return value can be printed to give an informative summary of the test.
The value also belongs to the class "kstest" for which there is a plot method.
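Because the result inherits from class "htest", its components can be read in the usual way. The sketch below uses base R's ks.test on made-up uniform numbers rather than kstest itself, so it only illustrates the generic structure of an "htest" object. The variable names u and res are illustrative.

```r
# Illustrative only: a plain ks.test result, not a kstest result.
set.seed(1)
u   <- runif(50)               # made-up data
res <- ks.test(u, "punif")     # one-sample test of uniformity on [0, 1]

inherits(res, "htest")   # TRUE
res$statistic            # the Kolmogorov-Smirnov statistic D
res$p.value              # the p-value
print(res)               # informative summary of the test
```

The same accessors should apply to the value returned by kstest, since it is documented as an "htest" object.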

Warning

The outcome of the test involves a small amount of random variability, because (by default) the covariate values are randomly perturbed to avoid tied values. Hence, if kstest is executed twice on the same data, the $p$-values will not be exactly the same. To avoid this behaviour, set jitter=FALSE.

Details

These functions perform a goodness-of-fit test of a Poisson point process model fitted to point pattern data. The observed distribution of the values of a spatial covariate at the data points, and the predicted distribution of the same values under the model, are compared using the Kolmogorov-Smirnov test.

The function kstest is generic, with methods for point patterns ("ppp" or "lpp"), point process models ("ppm" or "lppm") and spatial logistic regression models ("slrm").

If X is a point pattern dataset (object of class "ppp"), then kstest(X, ...) performs a goodness-of-fit test of the uniform Poisson point process (Complete Spatial Randomness, CSR) for this dataset. For a multitype point pattern, the uniform intensity is assumed to depend on the type of point (sometimes called Complete Spatial Randomness and Independence, CSRI).
If model is a fitted point process model (object of class "ppm" or "lppm") then kstest(model, ...) performs a test of goodness-of-fit for this fitted model. In this case, model should be a Poisson point process.
If model is a fitted spatial logistic regression (object of class "slrm") then kstest(model, ...) performs a test of goodness-of-fit for this fitted model.

The test is performed by comparing the observed distribution of the values of a spatial covariate at the data points, and the predicted distribution of the same covariate under the model, using the classical Kolmogorov-Smirnov test. Thus, you must nominate a spatial covariate for this test.

If X is a point pattern that does not have marks, the argument covariate should be either a function(x,y) or a pixel image (object of class "im") containing the values of a spatial function, or one of the characters "x" or "y" indicating the Cartesian coordinates. If covariate is an image, it should have numeric values, and its domain should cover the observation window of the model. If covariate is a function, it should expect two arguments x and y which are vectors of coordinates, and it should return a numeric vector of the same length as x and y.

If X is a multitype point pattern, the argument covariate can be either a function(x,y,marks), or a pixel image, or a list of pixel images corresponding to each possible mark value, or one of the characters "x" or "y" indicating the Cartesian coordinates.

First the original data point pattern is extracted from model. The values of the covariate at these data points are collected.
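The requirements on a covariate function stated in this section can be illustrated with two small examples in base R. Both functions are hypothetical and are not part of spatstat. The level name "on" is an assumed mark value chosen for illustration.

```r
# Hypothetical covariate for an unmarked pattern: distance from the origin.
# Vectorised: x and y are coordinate vectors of equal length.
dist_origin <- function(x, y) {
  sqrt(x^2 + y^2)
}

# Hypothetical covariate for a multitype pattern: uses x for points
# of type "on" and y for points of every other type.
by_type <- function(x, y, marks) {
  ifelse(marks == "on", x, y)
}

# Check the contract on made-up coordinates and marks.
x <- c(0, 3, 1)
y <- c(0, 4, 1)
m <- factor(c("on", "off", "on"))

v1 <- dist_origin(x, y)
v2 <- by_type(x, y, m)
stopifnot(is.numeric(v1), length(v1) == length(x))
stopifnot(is.numeric(v2), length(v2) == length(x))
```

Each function returns one numeric value per input point, which is the property the test relies on.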

The predicted distribution of the values of the covariate under the fitted model is computed as follows. The values of the covariate at all locations in the observation window are evaluated, weighted according to the point process intensity of the fitted model, and compiled into a cumulative distribution function $F$ using ewcdf.

The probability integral transformation is then applied: the values of the covariate at the original data points are transformed by the predicted cumulative distribution function $F$ into numbers between 0 and 1. If the model is correct, these numbers are i.i.d. uniform random numbers. The Kolmogorov-Smirnov test of uniformity is applied using ks.test.
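The two steps above (intensity-weighted distribution function, then probability integral transformation) can be sketched in base R. This is a minimal illustration under stated assumptions, not the code used by kstest: the window is taken to be the unit square, the covariate is the x coordinate, the intensity 100 * exp(x) stands in for a fitted model, and the weighted distribution function is built by hand as a stand-in for ewcdf. All object names are illustrative.

```r
set.seed(42)

# Assumed setup: unit square, covariate Z(x, y) = x,
# and a hypothetical fitted intensity lambda(x, y) = 100 * exp(x).
Z      <- function(x, y) x
lambda <- function(x, y) 100 * exp(x)

# Evaluate the covariate and the intensity at a fine grid of pixel centres.
g  <- (seq_len(200) - 0.5) / 200
xy <- expand.grid(x = g, y = g)
z  <- Z(xy$x, xy$y)
w  <- lambda(xy$x, xy$y)

# Intensity-weighted CDF of the covariate (hand-rolled stand-in for ewcdf).
# Tied covariate values are collapsed to their largest cumulative weight.
ord  <- order(z)
Fhat <- approxfun(z[ord], cumsum(w[ord]) / sum(w),
                  yleft = 0, yright = 1, ties = max)

# Simulated data points whose x coordinates follow the assumed model:
# density proportional to exp(x) on [0, 1], drawn by inverse-CDF sampling.
n  <- 150
xd <- log(1 + runif(n) * (exp(1) - 1))
yd <- runif(n)

# Probability integral transformation, then test of uniformity.
U   <- Fhat(Z(xd, yd))
res <- ks.test(U, "punif")
res$p.value
```

Since the simulated points were drawn from the same intensity that defines Fhat, the transformed values U should look approximately uniform, up to the discretisation error of the grid.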

This test was apparently first described (in the context of spatial data) by Berman (1986). See also Baddeley et al. (2005).

The return value is an object of class "htest" containing the results of the hypothesis test. The print method for this class gives an informative summary of the test outcome.

The return value also belongs to the class "kstest" for which there is a plot method plot.kstest. The plot method displays the empirical cumulative distribution function of the covariate at the data points, and the predicted cumulative distribution function of the covariate under the model, plotted against the value of the covariate.

The argument jitter controls whether covariate values are randomly perturbed, in order to avoid ties. If the original data contains any ties in the covariate (i.e. points with equal values of the covariate), and if jitter=FALSE, then the Kolmogorov-Smirnov test implemented in ks.test will issue a warning that it cannot calculate the exact $p$-value. To avoid this, if jitter=TRUE each value of the covariate will be perturbed by adding a small random value. The perturbations are normally distributed with standard deviation equal to one hundredth of the range of values of the covariate. This prevents ties, and the $p$-value is still correct. There is a very slight loss of power.
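The perturbation rule described above can be sketched in base R as follows. This shows only the mechanics of adding Gaussian noise with standard deviation equal to one hundredth of the range; it is not taken from the kstest source, and the covariate values are made up.

```r
set.seed(7)

# Made-up covariate values containing ties.
z <- c(1.2, 3.4, 3.4, 5.0, 5.0, 5.0, 7.7, 9.1)
anyDuplicated(z) > 0     # TRUE: ties are present

# Gaussian perturbation with sd = (range of the values) / 100.
sdev <- diff(range(z)) / 100
zj   <- z + rnorm(length(z), mean = 0, sd = sdev)

# Check whether any ties remain after the perturbation.
anyDuplicated(zj) > 0
```

Because the noise is small relative to the spread of the values, the ordering of well-separated values is left almost unchanged while tied values are separated.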

References

Baddeley, A., Turner, R., Moller, J. and Hazelton, M. (2005) Residual analysis for spatial point processes. Journal of the Royal Statistical Society, Series B 67, 617--666.

Berman, M. (1986) Testing for spatial association between a point process and another stochastic process. Applied Statistics 35, 54--62.

Examples


   # kstest and the datasets used below are provided by spatstat
   library(spatstat)

   # test of CSR using x coordinate
   kstest(nztrees, "x")

   # test of CSR using a function of x and y
   fun <- function(x,y){2* x + y}
   kstest(nztrees, fun)

   # test of CSR using an image covariate
   funimage <- as.im(fun, W=as.owin(nztrees))
   kstest(nztrees, funimage)

   # fit inhomogeneous Poisson model and test
   model <- ppm(nztrees, ~x)
   kstest(model, "x")

   if(interactive()) {
     # synthetic data: nonuniform Poisson process
     X <- rpoispp(function(x,y) { 100 * exp(x) }, win=square(1))

     # fit uniform Poisson process
     fit0 <- ppm(X, ~1)
     # fit correct nonuniform Poisson process
     fit1 <- ppm(X, ~x)

     # test wrong model
     kstest(fit0, "x")
     # test right model
     kstest(fit1, "x")
   }

   # multitype point pattern
   kstest(amacrine, "x")
   yimage <- as.im(function(x,y){y}, W=as.owin(amacrine))
   kstest(ppm(amacrine, ~marks+y), yimage)
