predict.ppm: Prediction from a Fitted Point Process Model

Description

Given a fitted point process model obtained by ppm, evaluate the intensity, conditional intensity or spatial trend of the model at new locations.

Usage

# S3 method for ppm
predict(object, window=NULL, ngrid=NULL, locations=NULL,
   covariates=NULL,
   type=c("trend", "cif", "intensity", "count"),
   se=FALSE,
   interval=c("none", "confidence", "prediction"),
   level = 0.95, 
   X=data.ppm(object), correction, ignore.hardcore=FALSE,
   ...,
   dimyx=NULL, eps=NULL,
   rule.eps = c("adjust.eps", "grow.frame", "shrink.frame"),
   new.coef=NULL, check=TRUE, repair=TRUE)

Value

If total is given:

a numeric vector or matrix.

If locations is given and is a data frame:

a vector of predicted values for the spatial locations (and marks, if required) given in locations.

If ngrid is given, or if locations is given and is a binary image mask or a pixel image:

If object is an unmarked point process, the result is a pixel image object (of class "im", see

im.object) containing the predictions. If object is a multitype point process, the result is a list of pixel images, containing the predictions for each type at the same grid of locations.

The ``predicted values'' are either values of the spatial trend (if type="trend"), values of the conditional intensity (if type="cif" or type="lambda"), values of the intensity (if type="intensity") or numbers of points (if type="count").

If se=TRUE, then the result is a list with two entries, the first being the predicted values in the format described above, and the second being the standard errors in the same format.

Arguments

object: A fitted point process model, typically obtained from the model-fitting algorithm ppm. An object of class "ppm" (see ppm.object).
window: Optional. A window (object of class "owin") delimiting the locations where predictions should be computed. Defaults to the window of the original data used to fit the model object.
ngrid: Optional. Dimensions of a rectangular grid of locations inside window where the predictions should be computed. An integer, or an integer vector of length 2, specifying the number of grid points in the $y$ and $x$ directions. (Incompatible with locations. Equivalent to dimyx.)
locations: Optional. Data giving the exact $x,y$ coordinates (and marks, if required) of locations at which predictions should be computed. Either a point pattern, or a data frame with columns named x and y, or a binary image mask, or a pixel image. (Incompatible with ngrid, dimyx and eps).
covariates: Values of external covariates required by the model. Either a data frame or a list of images. See Details.
type: Character string. Indicates which property of the fitted model should be predicted. Options are "trend" for the spatial trend, "cif" or "lambda" for the conditional intensity, "intensity" for the intensity, and "count" for the total number of points in window.
se: Logical value indicating whether to calculate standard errors as well.
interval: String (partially matched) indicating whether to produce estimates (interval="none", the default) or a confidence interval (interval="confidence") or a prediction interval (interval="prediction").
level: Coverage probability for the confidence or prediction interval.
X: Optional. A point pattern (object of class "ppp") to be taken as the data point pattern when calculating the conditional intensity. The default is to use the original data to which the model was fitted.
correction: Name of the edge correction to be used in calculating the conditional intensity. Options include "border" and "none". Other options may include "periodic", "isotropic" and "translate" depending on the model. The default correction is the one that was used to fit object.
ignore.hardcore: Advanced use only. Logical value specifying whether to compute only the finite part of the interaction potential (effectively removing any hard core interaction terms).
...: Ignored.
dimyx: Equivalent to ngrid.
eps: Width and height of pixels in the prediction grid. A numerical value, or numeric vector of length 2.
rule.eps: Argument passed to as.mask controlling the discretisation. See as.mask.
new.coef: Numeric vector of parameter values to replace the fitted model parameters coef(object).
check: Logical value indicating whether to check the internal format of object. If there is any possibility that this object has been restored from a dump file, or has otherwise lost track of the environment where it was originally computed, set check=TRUE.
repair: Logical value indicating whether to repair the internal format of object, if it is found to be damaged.

Intensity, Conditional Intensity and Trend

The point process models fitted by ppm are either Poisson point process models with a specified form of the intensity function, or Gibbs point process models with a specified form of the conditional intensity function.

The intensity $\lambda(u)$ of a point process at a location $u$ is the expected number of points of the process per unit area in the vicinity of the location $u$.

For a Poisson point process model fitted by ppm, the intensity function is of the form $$ \lambda(u) = \exp(\theta^\top Z(u)) $$ where $Z(u)$ is a specified function of spatial location (possibly vector valued) and $\theta$ is a parameter or parameter vector. Fitting the model is equivalent to estimating the parameters $\theta$. Given a fitted Poisson model, the intensity function can be calculated from the equation above.

For a Gibbs point process model, this approach is not feasible because the intensity function is not known analytically as a function of the model parameters. Instead it is more effective to work with the conditional intensity. The conditional intensity $\lambda(u\mid X)$ is the expected number of points per unit area in the vicinity of $u$, given that the rest of the point process is X.

For a Gibbs point process model fitted by ppm, the conditional intensity function is of the form $$ \lambda(u\mid X) = \exp(\theta^\top Z(u) + \phi^\top S(u\mid X) $$ where $S(u\mid X)$ is a specified function and $\phi$ is a parameter vector. The parameters $\phi$ control the interaction (stochastic dependence) between points in the process, while the parameters $\theta$ control the spatial inhomogeneity of the process. Fitting the model is equivalent to estimating the parameter vectors $\theta$ and $\phi$. Given a fitted Gibbs model, the conditional intensity function can be calculated from the equation above. When type="cif" the conditional intensity is calculated at every location $u$ in a pixel grid, using the observed point pattern X.

For a Gibbs point process model with conditional intensity $\lambda(u \mid X)$ as described above, the ‘spatial trend’ is $$ \beta(u) = \exp(\theta^\top Z(u)) $$ The spatial trend ignores the interaction between points. When type="trend", this function is calculated at every location $u$ in a pixel grid.

For a Gibbs point process model, the intensity function is not known analytically, but it is possible to approximate the intensity function (Baddeley and Nair, 2012). When type="intensity", this approximation is computed at every location $u$ in a pixel grid.

For a Poisson process, the intensity, conditional intensity and spatial trend are all identical.

For more information, see Sections 13.3.3, 13.3.6 and 13.12.6 of Baddeley, Rubak and Turner (2015).

Warnings

The current implementation invokes predict.glm so that prediction is wrong if the trend formula in object involves terms in ns(), bs() or poly(). This is a weakness of predict.glm itself!

Error messages may be very opaque, as they tend to come from deep in the workings of predict.glm. If you are passing the covariates argument and the function crashes, it is advisable to start by checking that all the conditions listed above are satisfied.

Author

Adrian Baddeley Adrian.Baddeley@curtin.edu.au and Rolf Turner rolfturner@posteo.net

Details

This function computes properties of a fitted spatial point process model (object of class "ppm"). For a Poisson point process it can compute the fitted intensity function, or the expected number of points in a region. For a Gibbs point process it can compute the spatial trend (first order potential), conditional intensity, and approximate intensity of the process. Point estimates, standard errors, confidence intervals and prediction intervals are available.

Given a point pattern dataset, we may fit a point process model to the data using the model-fitting algorithm ppm. This returns an object of class "ppm" representing the fitted point process model (see ppm.object). The parameter estimates in this fitted model can be read off simply by printing the ppm object. The spatial trend, conditional intensity and intensity of the fitted model are evaluated using this function predict.ppm.

The default action is to create a rectangular grid of points in the observation window of the data point pattern, and evaluate the spatial trend at these locations.

The argument type specifies the values that are desired:

If type="trend":: the ``spatial trend'' of the fitted model is evaluated at each required spatial location $u$. See below.
If type="cif":: the conditional intensity $\lambda(u, X)$ of the fitted model is evaluated at each required spatial location $u$, with respect to the data point pattern $X$.
If type="intensity":: the intensity $\lambda(u)$ of the fitted model is evaluated at each required spatial location $u$.
If type="count":: the expected total number of points (or the expected number of points falling in window) is evaluated. If window is a tessellation, the expected number of points in each tile of the tessellation is evaluated.

The spatial trend, conditional intensity, and intensity are all equivalent if the fitted model is a Poisson point process, but they are different if the model is not a Poisson process. See the section on Intensity, Conditional Intensity and Trend below.

The default is to compute an estimate of the desired quantity. If interval="confidence" or interval="prediction", the estimate is replaced by a confidence interval or prediction interval.

If se=TRUE, then a standard error is also calculated, and is returned together with the (point or interval) estimate.

The spatial locations where predictions are required, are determined by the (incompatible) arguments ngrid and locations.

If the argument ngrid is present, then predictions are performed at a rectangular grid of locations in the window window. The result of prediction will be a pixel image or images.
If locations is present, then predictions will be performed at the spatial locations given by this dataset. These may be an arbitrary list of spatial locations, or they may be a rectangular grid. The result of prediction will be either a numeric vector or a pixel image or images.
If neither ngrid nor locations is given, then ngrid is assumed. The value of ngrid defaults to spatstat.options("npixel"), which is initialised to 128 when spatstat is loaded.

The argument locations may be a point pattern, a data frame or a list specifying arbitrary locations; or it may be a binary image mask (an object of class "owin" with type "mask") or a pixel image (object of class "im") specifying (a subset of) a rectangular grid of locations.

If locations is a point pattern (object of class "ppp"), then prediction will be performed at the points of the point pattern. The result of prediction will be a vector of predicted values, one value for each point. If the model is a marked point process, then locations should be a marked point pattern, with marks of the same kind as the model; prediction will be performed at these marked points. The result of prediction will be a vector of predicted values, one value for each (marked) point.
If locations is a data frame or list, then it must contain vectors locations$x and locations$y specifying the $x,y$ coordinates of the prediction locations. Additionally, if the model is a marked point process, then locations must also contain a factor locations$marks specifying the marks of the prediction locations. These vectors must have equal length. The result of prediction will be a vector of predicted values, of the same length.
If locations is a binary image mask, then prediction will be performed at each pixel in this binary image where the pixel value is TRUE (in other words, at each pixel that is inside the window). If the fitted model is an unmarked point process, then the result of prediction will be an image. If the fitted model is a marked point process, then prediction will be performed for each possible value of the mark at each such location, and the result of prediction will be a list of images, one for each mark value.
If locations is a pixel image (object of class "im"), then prediction will be performed at each pixel in this image where the pixel value is defined (i.e.\ where the pixel value is not NA).

The argument covariates gives the values of any spatial covariates at the prediction locations. If the trend formula in the fitted model involves spatial covariates (other than the Cartesian coordinates x, y) then covariates is required. The format and use of covariates are analogous to those of the argument of the same name in ppm. It is either a data frame or a list of images.

If covariates is a list of images, then the names of the entries should correspond to the names of covariates in the model formula trend. Each entry in the list must be an image object (of class "im", see im.object). The software will look up the pixel values of each image at the quadrature points.
If covariates is a data frame, then the ith row of covariates is assumed to contain covariate data for the ith location. When locations is a data frame, this just means that each row of covariates contains the covariate data for the location specified in the corresponding row of locations. When locations is a binary image mask, the row covariates[i,] must correspond to the location x[i],y[i] where x = as.vector(raster.x(locations)) and y = as.vector(raster.y(locations)).

Note that if you only want to use prediction in order to generate a plot of the predicted values, it may be easier to use plot.ppm which calls this function and plots the results.

References

Baddeley, A. and Nair, G. (2012) Fast approximation of the intensity of Gibbs point processes. Electronic Journal of Statistics 6 1155--1169.

Baddeley, A., Rubak, E. and Turner, R. (2015) Spatial Point Patterns: Methodology and Applications with R. Chapman and Hall/CRC Press.

Examples

Run this code

  op <- spatstat.options(npixel=32)
  m <- ppm(cells ~ polynom(x,y,2), Strauss(0.05))
  trend <- predict(m, type="trend")
  if(human <- interactive()) {
  image(trend)
  points(cells)
  }
  cif <- predict(m, type="cif")
  if(human) {
  persp(cif)
  }
  mj <- ppm(japanesepines ~ harmonic(x,y,2))
  se <- predict(mj, se=TRUE) # image of standard error
  ci <- predict(mj, interval="c") # two images, confidence interval

  # prediction interval for total number of points
  predict(mj, type="count", interval="p")

  # prediction intervals for counts in tiles
  predict(mj, window=quadrats(japanesepines, 3), type="count", interval="p")

  # prediction at arbitrary locations
  predict(mj, locations=data.frame(x=0.3, y=0.4))

  X <- runifpoint(5, Window(japanesepines))
  predict(mj, locations=X, se=TRUE)

  # multitype
  rr <- matrix(0.06, 2, 2)
  ma <- ppm(amacrine ~ marks,  MultiStrauss(rr))
  Z <- predict(ma)
  Z <- predict(ma, type="cif")
  predict(ma, locations=data.frame(x=0.8, y=0.5,marks="on"), type="cif")

  spatstat.options(op)

Run the code above in your browser using DataLab