lurking: Lurking variable plot

Description

Plot spatial point process residuals against a covariate

Usage

lurking(object, covariate, type="eem",
                    cumulative=TRUE,
                    clipwindow=default.clipwindow(object),
                    rv,
                    plot.sd, plot.it=TRUE,
                    typename,
                    covname,
                    oldstyle=FALSE, check=TRUE,
                    ...,
                    splineargs=list(spar=0.5))

Arguments

object

The fitted point process model (an object of class "ppm") for which diagnostics should be produced. This object is usually obtained from ppm. Alternatively, object may be a

covariate

The covariate against which residuals should be plotted. Either a numeric vector, a pixel image, or an expression. See Details below.

type

String indicating the type of residuals or weights to be computed. Choices include "eem", "raw", "inverse" and "pearson". See diagnose.ppm

cumulative

Logical flag indicating whether to plot a cumulative sum of marks (cumulative=TRUE) or the derivative of this sum, a marginal density of the smoothed residual field (cumulative=FALSE).

clipwindow

If not NULL this argument indicates that residuals shall only be computed inside a subregion of the window containing the original point pattern data. Then clipwindow should be a window object of class "owin"

Usually absent. If this argument is present, the values of the residuals will not be calculated from the fitted model object but will instead be taken directly from this vector.

plot.sd

Logical value indicating whether error bounds should be added to plot. The default is TRUE for Poisson models and FALSE for non-Poisson models. See Details.

plot.it

Logical value indicating whether plots should be shown. If plot.it=FALSE, only the computed coordinates for the plots are returned. See Value.

typename

Usually absent. If this argument is present, it should be a string, and will be used (in the axis labels of plots) to describe the type of residuals.

covname

A string name for the covariate, to be used in axis labels of plots.

oldstyle

Logical flag indicating whether error bounds should be plotted using the approximation given in the original paper (oldstyle=TRUE), or using the correct asymptotic formula (oldstyle=FALSE).

check

Logical flag indicating whether the integrity of the data structure in object should be checked.

...

Arguments passed to plot.default and lines to control the plot behaviour.

splineargs

A list of arguments passed to smooth.spline for the estimation of the derivatives in the case cumulative=FALSE.

Value

A list containing two dataframes empirical and theoretical. The first dataframe empirical contains columns covariate and value giving the coordinates of the lurking variable plot. The second dataframe theoretical contains columns covariate, mean and sd giving the coordinates of the plot of the theoretical mean and standard deviation.

Details

This function generates a `lurking variable' plot for a fitted point process model. Residuals from the model represented by object are plotted against the covariate specified by covariate. This plot can be used to reveal departures from the fitted model, in particular, to reveal that the point pattern depends on the covariate.

First the residuals from the fitted model (Baddeley et al, 2004) are computed at each quadrature point, or alternatively the `exponential energy marks' (Stoyan and Grabarnik, 1991) are computed at each data point. The argument type selects the type of residual or weight. See diagnose.ppm for options and explanation.

A lurking variable plot for point processes (Baddeley et al, 2004) displays either the cumulative sum of residuals/weights (if cumulative = TRUE) or a kernel-weighted average of the residuals/weights (if cumulative = FALSE) plotted against the covariate. The empirical plot (solid lines) is shown together with its expected value assuming the model is true (dashed lines) and optionally also the pointwise two-standard-deviation limits (dotted lines). To be more precise, let $Z(u)$ denote the value of the covariate at a spatial location $u$.

Ifcumulative=TRUEthen we plot$H(z)$against$z$, where$H(z)$is the sum of the residuals over all quadrature points where the covariate takes a value less than or equal to$z$, or the sum of the exponential energy weights over all data points where the covariate takes a value less than or equal to$z$.
Ifcumulative=FALSEthen we plot$h(z)$against$z$, where$h(z)$is the derivative of$H(z)$, computed approximately by spline smoothing.

For the point process residuals $E(H(z)) = 0$, while for the exponential energy weights $E(H(z)) =$ area of the subset of the window satisfying $Z(u) <= z$.="" <="" p="">

If the empirical and theoretical curves deviate substantially from one another, the interpretation is that the fitted model does not correctly account for dependence on the covariate. The correct form (of the spatial trend part of the model) may be suggested by the shape of the plot. If plot.sd = TRUE, then superimposed on the lurking variable plot are the pointwise two-standard-deviation error limits for $H(x)$ calculated for the inhomogeneous Poisson process. The default is plot.sd = TRUE for Poisson models and plot.sd = FALSE for non-Poisson models.

By default, the two-standard-deviation limits are calculated from the exact formula for the asymptotic variance of the residuals under the asymptotic normal approximation, equation (37) of Baddeley et al (2006). However, for compatibility with the original paper of Baddeley et al (2005), if oldstyle=TRUE, the two-standard-deviation limits are calculated using the innovation variance, an over-estimate of the true variance of the residuals.

The argument object must be a fitted point process model (object of class "ppm") typically produced by the maximum pseudolikelihood fitting algorithm ppm).

The argument covariate is either a numeric vector, a pixel image, or an R language expression. If it is a numeric vector, it is assumed to contain the values of the covariate for each of the quadrature points in the fitted model. The quadrature points can be extracted by quad.ppm(object).

If covariate is a pixel image, it is assumed to contain the values of the covariate at each location in the window. The values of this image at the quadrature points will be extracted.

Alternatively, if covariate is an expression, it will be evaluated in the same environment as the model formula used in fitting the model object. It must yield a vector of the same length as the number of quadrature points. The expression may contain the terms x and y representing the cartesian coordinates, and may also contain other variables that were available when the model was fitted. Certain variable names are reserved words; see ppm.

Note that lurking variable plots for the $x$ and $y$ coordinates are also generated by diagnose.ppm, amongst other types of diagnostic plots. This function is more general in that it enables the user to plot the residuals against any chosen covariate that may have been present.

For advanced use, even the values of the residuals/weights can be altered. If the argument rv is present, the residuals will not be calculated from the fitted model object but will instead be taken directly from rv. The vector rv should be numeric, and its length should equal the number of data points in the original point pattern (if type="eem") or the number of quadrature points used to fit the model (otherwise).

References

Baddeley, A., Turner, R., Moller, J. and Hazelton, M. (2005) Residual analysis for spatial point processes. Journal of the Royal Statistical Society, Series B 67, 617--666.

Baddeley, A., Moller, J. and Pakes, A.G. (2006) Properties of residuals for spatial point processes. To appear. Stoyan, D. and Grabarnik, P. (1991) Second-order characteristics for stochastic structures connected with Gibbs point processes. Mathematische Nachrichten, 151:95--100.

Examples

Run this code

data(nztrees)
  fit <- ppm(nztrees, ~x, Poisson())
  lurking(fit, expression(x))
  lurking(fit, expression(x), cumulative=FALSE)

Run the code above in your browser using DataLab