pcf.ppp
Pair Correlation Function of Point Pattern
Estimates the pair correlation function of a point pattern using kernel methods.
- Keywords
- spatial, nonparametric
Usage
## S3 method for class 'ppp':
pcf(X, \dots, r = NULL, kernel="epanechnikov", bw=NULL,
stoyan=0.15,
correction=c("translate", "Ripley"),
divisor = c("r", "d"),
domain=NULL)
Arguments
- X
- A point pattern (object of class
"ppp"
). - r
- Vector of values for the argument $r$ at which $g(r)$ should be evaluated. There is a sensible default.
- kernel
- Choice of smoothing kernel,
passed to
density
. - bw
- Bandwidth for smoothing kernel, passed to
density
. - ...
- Other arguments passed to the kernel density estimation
function
density
. - stoyan
- Bandwidth coefficient; see Details.
- correction
- Choice of edge correction.
- divisor
- Choice of divisor in the estimation formula:
either
"r"
(the default) or"d"
. See Details. - domain
- Optional. Calculations will be restricted to this subset of the window. See Details.
Details
The pair correlation function $g(r)$
is a summary of the dependence between points in a spatial point
process. The best intuitive interpretation is the following: the probability
$p(r)$ of finding two points at locations $x$ and $y$
separated by a distance $r$ is equal to
$$p(r) = \lambda^2 g(r) \,{\rm d}x \, {\rm d}y$$
where $\lambda$ is the intensity of the point process.
For a completely random (uniform Poisson) process,
$p(r) = \lambda^2 \,{\rm d}x \, {\rm d}y$
so $g(r) = 1$.
Formally, the pair correlation function of a stationary point process
is defined by
$$g(r) = \frac{K'(r)}{2\pi r}$$
where $K'(r)$ is the derivative of $K(r)$, the
reduced second moment function (aka ``Ripley's $K$ function'')
of the point process. See Kest
for information
about $K(r)$.
For a stationary Poisson process, the pair correlation function is identically equal to 1. Values $g(r) < 1$ suggest inhibition between points; values greater than 1 suggest clustering.
This routine computes an estimate of $g(r)$ by kernel smoothing.
- If
divisor="r"
(the default), then the standard kernel estimator (Stoyan and Stoyan, 1994, pages 284--285) is used. By default, the recommendations of Stoyan and Stoyan (1994) are followed exactly. - If
divisor="d"
then a modified estimator is used: the contribution from an interpoint distance$d_{ij}$to the estimate of$g(r)$is divided by$d_{ij}$instead of dividing by$r$. This usually improves the bias of the estimator when$r$is close to zero.
There is also a choice of spatial edge corrections (which are needed to avoid bias due to edge effects associated with the boundary of the spatial window):
- If
correction="translate"
orcorrection="translation"
then the translation correction is used. Fordivisor="r"
the translation-corrected estimate is given in equation (15.15), page 284 of Stoyan and Stoyan (1994). - If
correction="Ripley"
then Ripley's isotropic edge correction is used. Fordivisor="r"
the isotropic-corrected estimate is given in equation (15.18), page 285 of Stoyan and Stoyan (1994). - If
correction=c("translate", "Ripley")
then both estimates will be computed.
kernel
which is passed to density
.
The default is the Epanechnikov kernel, recommended by
Stoyan and Stoyan (1994, page 285). The bandwidth of the smoothing kernel can be controlled by the
argument bw
. Its precise interpretation
is explained in the documentation for density.default
.
For the Epanechnikov kernel, the argument bw
is
equivalent to $h/\sqrt{5}$.
Stoyan and Stoyan (1994, page 285) recommend using the Epanechnikov
kernel with support $[-h,h]$ chosen by the rule of thumn
$h = c/\sqrt{\lambda}$,
where $\lambda$ is the (estimated) intensity of the
point process, and $c$ is a constant in the range from 0.1 to 0.2.
See equation (15.16).
If bw
is missing, then this rule of thumb will be applied.
The argument stoyan
determines the value of $c$.
The argument r
is the vector of values for the
distance $r$ at which $g(r)$ should be evaluated.
There is a sensible default.
If it is specified, r
must be a vector of increasing numbers
starting from r[1] = 0
,
and max(r)
must not exceed half the diameter of
the window.
If the argument domain
is given, estimation will be restricted
to this region. That is, the estimate of
$g(r)$ will be based on pairs of points in which the first point lies
inside domain
and the second point is unrestricted.
The argument domain
should be a window (object of class "owin"
) or something acceptable to
as.owin
. It must be a subset of the
window of the point pattern X
.
To compute a confidence band for the true value of the
pair correlation function, use lohboot
.
Value
- A function value table
(object of class
"fv"
). Essentially a data frame containing the variables r the vector of values of the argument $r$ at which the pair correlation function $g(r)$ has been estimated theo vector of values equal to 1, the theoretical value of $g(r)$ for the Poisson process trans vector of values of $g(r)$ estimated by translation correction iso vector of values of $g(r)$ estimated by Ripley isotropic correction - as required.
References
Stoyan, D. and Stoyan, H. (1994) Fractals, random shapes and point fields: methods of geometrical statistics. John Wiley and Sons.
See Also
Examples
data(simdat)
<testonly>simdat <- simdat[seq(1,simdat$n, by=4)]</testonly>
p <- pcf(simdat)
plot(p, main="pair correlation function for simdat")
# indicates inhibition at distances r < 0.3
pd <- pcf(simdat, divisor="d")
# compare estimates
plot(p, cbind(iso, theo) ~ r, col=c("blue", "red"),
ylim.covers=0, main="", lwd=c(2,1), lty=c(1,3), legend=FALSE)
plot(pd, iso ~ r, col="green", lwd=2, add=TRUE)
legend("center", col=c("blue", "green"), lty=1, lwd=2,
legend=c("divisor=r","divisor=d"))