# Gest

##### Nearest Neighbour Distance Function G

Estimates the nearest neighbour distance distribution function $G(r)$ from a point pattern in a window of arbitrary shape.

- Keywords
- spatial

##### Usage

```
Gest(X)
Gest(X, r)
Gest(X, breaks)
nearest.neighbour(X)
```

##### Arguments

- X
- The observed point pattern, from which an estimate of $G(r)$ will be computed. An object of class `ppp`, or data in any format acceptable to `as.ppp()`.
- r
- Numeric vector. The values of the argument $r$ at which $G(r)$ should be evaluated. There is a sensible default. First-time users are strongly advised not to specify this argument. See below for important conditions on $r$.
- breaks
- An alternative to the argument `r`. Not normally invoked by the user. See the **Details** section.

##### Details

The nearest neighbour distance distribution function (also called the "*event-to-event*" or "*inter-event*" distribution) of a point process $X$ is the cumulative distribution function $G$ of the distance from a typical random point of $X$ to the nearest other point of $X$.

An estimate of $G$ derived from a spatial point pattern dataset can be used in exploratory data analysis and formal inference about the pattern (Cressie, 1991; Diggle, 1983; Ripley, 1988). In exploratory analyses, the estimate of $G$ is a useful statistic summarising one aspect of the "clustering" of points. For inferential purposes, the estimate of $G$ is usually compared to the true value of $G$ for a completely random (Poisson) point process, which is $$G(r) = 1 - e^{ - \lambda \pi r^2}$$ where $\lambda$ is the intensity (expected number of points per unit area). Deviations between the empirical and theoretical $G$ curves may suggest spatial clustering or spatial regularity.
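As a quick illustration of the Poisson formula, the following base R sketch evaluates the theoretical $G(r)$ for an assumed intensity. The helper name `G_poisson` and the values chosen for `lambda` and `r` are illustrative assumptions, not part of spatstat.

```r
# Theoretical nearest neighbour distance c.d.f. for a homogeneous Poisson
# process with intensity lambda (hypothetical helper, not a spatstat function).
G_poisson <- function(r, lambda) {
  1 - exp(-lambda * pi * r^2)
}

lambda <- 50                      # assumed intensity: 50 points per unit area
r <- seq(0, 0.25, by = 0.05)      # distances at which to evaluate G
G_theo <- G_poisson(r, lambda)
print(round(G_theo, 3))
```

An empirical curve lying above these values at short distances would point towards clustering, and one lying below towards regularity.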

This algorithm estimates the nearest neighbour distance distribution function $G$ from the point pattern `X`. It assumes that `X` can be treated as a realisation of a stationary (spatially homogeneous) random spatial point process in the plane, observed through a bounded window. The window (which is specified in `X` as `X$window`) may have arbitrary shape.

The argument `X` is interpreted as a point pattern object (of class `"ppp"`, see `ppp.object`) and can be supplied in any of the formats recognised by `as.ppp()`.

The estimation of $G$ is hampered by edge effects arising from the unobservability of points of the random pattern outside the window. An edge correction is needed to reduce bias (Baddeley, 1998; Ripley, 1988). The two edge corrections implemented here are the border method or "*reduced sample*" estimator, and the spatial Kaplan-Meier estimator (Baddeley and Gill, 1997).
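To make the border method concrete, here is a minimal base R sketch of a reduced-sample estimate. It assumes the window is the unit square, and it uses one common form of the estimator: for each $r$, only points lying more than $r$ from the window boundary are counted. The variable and function names are illustrative; this is not the code used inside `Gest`.

```r
# Border ("reduced sample") estimate of G on the unit square.
# Assumption: the window is [0,1] x [0,1]; all names are illustrative only.
set.seed(1)
n <- 50
x <- runif(n)
y <- runif(n)

# Distance from each point to its nearest other point
D <- as.matrix(dist(cbind(x, y)))
diag(D) <- Inf
nnd <- apply(D, 1, min)

# Distance from each point to the boundary of the unit square
bd <- pmin(x, 1 - x, y, 1 - y)

G_border <- function(r) {
  eligible <- bd > r              # points whose disc of radius r lies inside the window
  if (!any(eligible)) return(NA_real_)
  mean(nnd[eligible] <= r)        # fraction of eligible points with a neighbour within r
}

r <- seq(0, 0.2, by = 0.01)
G_rs <- sapply(r, G_border)
```

Because the set of eligible points shrinks as `r` grows, the resulting values need not be nondecreasing in `r`.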

The argument `r` is the vector of values for the distance $r$ at which $G(r)$ should be evaluated. It is also used to determine the breakpoints (in the sense of `hist`) for the computation of histograms of distances. The reduced-sample and Kaplan-Meier estimators are computed from histogram counts. In the case of the Kaplan-Meier estimator this introduces a discretisation error which is controlled by the fineness of the breakpoints.

First-time users are strongly advised not to specify `r`. However, if it is specified, `r` must satisfy `r[1] == 0`, and `max(r)` must be larger than the radius of the largest disc contained in the window. Furthermore, the successive entries of `r` must be finely spaced.
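The conditions on `r` can be sketched as a simple check. The helper `check_r` is hypothetical and not part of spatstat. The example assumes a unit square window, whose largest inscribed disc has radius 0.5. The spacing threshold `max_step` is an arbitrary illustration, since the text does not quantify "finely spaced".

```r
# Sketch of a validity check for a user-supplied r vector (hypothetical helper).
# Assumptions: entries of r are increasing, and max_step is an arbitrary
# stand-in for "finely spaced".
check_r <- function(r, inradius, max_step = 0.01) {
  r[1] == 0 &&                    # must start at zero
    all(diff(r) > 0) &&           # must be strictly increasing
    max(r) > inradius &&          # must exceed the radius of the largest disc in the window
    max(diff(r)) <= max_step      # successive entries must be close together
}

r_good <- seq(0, 0.6, by = 0.005)
r_bad  <- seq(0.1, 0.4, by = 0.1)

check_r(r_good, inradius = 0.5)   # TRUE
check_r(r_bad,  inradius = 0.5)   # FALSE: does not start at 0
```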

The algorithm also returns an estimate of the hazard rate function, $\lambda(r)$, of $G(r)$. The hazard rate is defined as the derivative $$\lambda(r) = - \frac{d}{dr} \log (1 - G(r))$$ This estimate should be used with caution as $G$ is not necessarily differentiable.
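For the Poisson case the hazard rate has a closed form: since $-\log(1 - G(r)) = \lambda \pi r^2$, differentiating gives $2 \pi \lambda r$, where $\lambda$ here denotes the intensity rather than the hazard function. The base R sketch below compares this with a finite-difference approximation. The intensity and the grid of distances are assumed values chosen for illustration.

```r
# Hazard rate of G for a Poisson process, computed two ways.
lambda <- 100                          # assumed intensity
r <- seq(0, 0.1, by = 0.001)
G <- 1 - exp(-lambda * pi * r^2)

# Finite-difference approximation to -d/dr log(1 - G(r)), at interval midpoints
haz_numeric <- -diff(log(1 - G)) / diff(r)
r_mid <- (head(r, -1) + tail(r, -1)) / 2

# Closed form obtained by differentiating lambda * pi * r^2
haz_exact <- 2 * pi * lambda * r_mid

# Largest absolute discrepancy between the two versions
max(abs(haz_numeric - haz_exact))
```

The agreement is this close only because $G$ is smooth here; for an empirical estimate, which is a step function, a difference quotient of this kind can be very unstable.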

The naive empirical distribution of distances from each point of the pattern `X` to the nearest other point of the pattern is a biased estimate of $G$. However, it is also returned by the algorithm, as it is sometimes useful in other contexts. Care should be taken not to use the uncorrected empirical $G$ as if it were an unbiased estimator of $G$.
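The uncorrected estimate can be reproduced in a few lines of base R. The sketch below simulates points in the unit square (an assumption made for the example), computes each point's distance to its nearest other point, and takes the empirical c.d.f. of those distances. The variable names are illustrative.

```r
# Naive (uncorrected) empirical estimate of G from simulated points.
set.seed(42)
n <- 100
xy <- cbind(runif(n), runif(n))        # assumed pattern: uniform points in the unit square

# Pairwise distances, with the diagonal excluded so a point is not its own neighbour
D <- as.matrix(dist(xy))
diag(D) <- Inf
nnd <- apply(D, 1, min)                # nearest neighbour distance for each point

G_raw <- ecdf(nnd)                     # empirical c.d.f. of the distances
G_raw(c(0.02, 0.05, 0.1))
```

Points near the boundary may have their true nearest neighbour outside the window, so their observed distances tend to be too large. This is the source of the bias.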

##### Value

A data frame containing six columns:

- `r`: the values of the argument $r$ at which the function $G(r)$ has been estimated
- `rs`: the "reduced sample" or "border correction" estimator of $G(r)$
- `km`: the spatial Kaplan-Meier estimator of $G(r)$
- `hazard`: the hazard rate $\lambda(r)$ of $G(r)$ by the spatial Kaplan-Meier method
- `raw`: the uncorrected estimate of $G(r)$, i.e. the empirical distribution of the distances from each point in the pattern `X` to the nearest other point of the pattern
- `theo`: the theoretical value of $G(r)$ for a stationary Poisson process of the same estimated intensity

##### Synopsis

`Gest(X, r=NULL, breaks=NULL, ...)`

##### Warnings

The function $G$ does not necessarily have a density. Any valid c.d.f. may appear as the nearest neighbour distance distribution function of a stationary point process.

The reduced sample estimator of $G$ is pointwise approximately unbiased, but need not be a valid distribution function; it may not be a nondecreasing function of $r$. Its range is always within $[0,1]$.

The spatial Kaplan-Meier estimator of $G$ is always nondecreasing but its maximum value may be less than $1$.

##### References

Baddeley, A.J. Spatial sampling and censoring.
In O.E. Barndorff-Nielsen, W.S. Kendall and
M.N.M. van Lieshout (eds)
*Stochastic Geometry: Likelihood and Computation*.
Chapman and Hall, 1998.
Chapter 2, pages 37-78.

Baddeley, A.J. and Gill, R.D.
Kaplan-Meier estimators of interpoint distance
distributions for spatial point processes.
*Annals of Statistics* **25** (1997) 263-292.

Cressie, N.A.C. *Statistics for spatial data*.
John Wiley and Sons, 1991.

Diggle, P.J. *Statistical analysis of spatial point patterns*.
Academic Press, 1983.

Ripley, B.D. *Statistical inference for spatial processes*.
Cambridge University Press, 1988.

Stoyan, D., Kendall, W.S. and Mecke, J.
*Stochastic geometry and its applications*.
2nd edition. Springer Verlag, 1995.

##### Examples

```
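library(spatstat)

# Simulate 50 uniformly distributed points in the unit square
pp <- runifpoint(50)
Gpp <- Gest(pp)

# Kaplan-Meier estimate against the theoretical Poisson curve
r <- Gpp$r
plot(r, Gpp$km, type="l", xlab="r", ylab="G(r)", ylim=c(0,1),
     main = "nearest neighbour function")
lines(r, Gpp$theo, lty=2)
# Place the legend by keyword so that it stays inside the plotting region
legend("bottomright", c("Kaplan-Meier estimator", "Poisson process"), lty=c(1,2))

# The cells dataset: two equivalent ways to plot the estimate
data(cells)
Gc <- Gest(cells)
plot(Gc$r, Gc$km, type="l")
plot(km ~ r, type="l", data=Gc)

# restrict the plot to values of r up to 0.1
plot(km ~ r, type="l", data=Gc[Gc$r <= 0.1, ])
plot(km ~ r, type="l", data=Gc, subset=(r <= 0.1))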
```

*Documentation reproduced from package spatstat, version 1.3-4, License: GPL version 2 or newer*