markcorr: Mark Correlation Function

Description

Estimate the marked correlation function of a marked point pattern.

Usage

markcorr(X, f = function(m1, m2) { m1 * m2}, r=NULL,
         correction=c("isotropic", "Ripley", "translate"),
         method="density", ...,
         f1=NULL, normalise=TRUE, fargs=NULL)

Arguments

X

The observed point pattern. An object of class "ppp" or something acceptable to as.ppp.

f

Optional. Test function $f$ used in the definition of the mark correlation function. An R function with at least two arguments. There is a sensible default.

r

Optional. Numeric vector. The values of the argument $r$ at which the mark correlation function $k_f(r)$ should be evaluated. There is a sensible default.

correction

A character vector containing any selection of the options "isotropic", "Ripley", "translate", "none" or "best". It specifies the edge correction(s) to be applied.

method

A character vector indicating the user's choice of density estimation technique to be used. Options are "density", "loess", "sm" and "smrep".

...

Arguments passed to the density estimation routine (density, loess or sm.density) selected by method.

f1

An alternative to f. If this argument is given, then $f$ is assumed to take the form $f(u,v)=f_1(u)f_1(v)$.

normalise

If normalise=FALSE, compute only the numerator of the expression for the mark correlation.

fargs

Optional. A list of extra arguments to be passed to the function f or f1.

Value

An object of class "fv" (see fv.object). Essentially a data frame containing numeric columns

r

the values of the argument $r$ at which the mark correlation function $k_f(r)$ has been estimated

theo

the theoretical value of $k_f(r)$ when the marks attached to different points are independent, namely 1

together with a column or columns named "iso" and/or "trans", according to the selected edge corrections. These columns contain estimates of the mark correlation function $k_f(r)$ obtained by the edge corrections named.

Details

By default, this command calculates an estimate of Stoyan's mark correlation $k_{mm}(r)$ for the point pattern.

Alternatively if the argument f or f1 is given, then it calculates Stoyan's generalised mark correlation $k_f(r)$ with test function $f$.

Theoretical definitions are as follows (see Stoyan and Stoyan, 1994, p. 262):

For a point process $X$ with numeric marks, Stoyan's mark correlation function $k_{mm}(r)$ is $$k_{mm}(r) = \frac{E_{0u}[M(0) M(u)]}{E[M M']}$$ where $E_{0u}$ denotes the conditional expectation given that there are points of the process at the locations $0$ and $u$ separated by a distance $r$, and where $M(0), M(u)$ denote the marks attached to these two points. In the denominator, $M, M'$ are random marks drawn independently from the marginal distribution of marks, and $E$ is the usual expectation.

For a multitype point process $X$, the mark correlation is $$k_{mm}(r) = \frac{P_{0u}[M(0) = M(u)]}{P[M = M']}$$ where $P$ and $P_{0u}$ denote the probability and conditional probability.

The generalised mark correlation function $k_f(r)$ of a marked point process $X$, with test function $f$, is $$k_f(r) = \frac{E_{0u}[f(M(0),M(u))]}{E[f(M,M')]}$$

The test function $f$ is any function $f(m_1,m_2)$ with two arguments which are possible marks of the pattern, and which returns a nonnegative real value. Common choices of $f$ are: for continuous real-valued marks, $$f(m_1,m_2) = m_1 m_2$$ for discrete marks (multitype point patterns), $$f(m_1,m_2) = 1(m_1 = m_2)$$ and for marks taking values in $[0,2\pi)$, $$f(m_1,m_2) = \sin(m_1 - m_2)$$

Note that $k_f(r)$ is not a "correlation" in the usual statistical sense. It can take any nonnegative real value. The value 1 suggests "lack of correlation": if the marks attached to the points of X are independent and identically distributed, then $k_f(r) \equiv 1$. The interpretation of values larger or smaller than 1 depends on the choice of function $f$.
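The common choices of test function listed above can be written as vectorised R functions. The following is a minimal sketch in base R; the names f_prod, f_equal and f_sin are hypothetical and are not part of the package.

```r
# Product of marks: the usual choice for continuous real-valued marks
f_prod <- function(m1, m2) m1 * m2

# Indicator of equal marks: the usual choice for multitype patterns
f_equal <- function(m1, m2) as.numeric(m1 == m2)

# Sine of the difference, for marks that are angles in [0, 2*pi)
f_sin <- function(m1, m2) sin(m1 - m2)

# Factors must share the same levels to be compared with ==
lev <- c("off", "on")
a <- factor(c("on", "off"), levels = lev)
b <- factor(c("on", "on"), levels = lev)

f_prod(c(1, 2), c(3, 4))
f_equal(a, b)
f_sin(pi / 2, 0)
```

Each function accepts two vectors of equal length and returns a numeric vector of the same length, which is the form that a user-supplied f is expected to take.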

The argument X must be a point pattern (object of class "ppp") or any data that are acceptable to as.ppp. It must be a marked point pattern.

The argument f determines the function to be applied to pairs of marks. It has a sensible default, which depends on the kind of marks in X. If the marks are numeric values, then f <- function(m1, m2) { m1 * m2} computes the product of two marks. If the marks are a factor (i.e. if X is a multitype point pattern) then f <- function(m1, m2) { m1 == m2} yields the value 1 when the two marks are equal, and 0 when they are unequal. These are the conventional definitions for numerical marks and multitype points respectively.

The argument f may be specified by the user. It must be an R function, accepting two arguments m1 and m2 which are vectors of equal length containing mark values (of the same type as the marks of X). (It may also take additional arguments, passed through fargs.) It must return a vector of numeric values of the same length as m1 and m2. The values must be non-negative, and NA values are not permitted.
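Before passing a custom test function to markcorr, it can be useful to check it against the requirements just stated. The helper below is a hypothetical sketch in base R (check_test_function is not part of the package). It assumes that logical results are acceptable, since the documented multitype default m1 == m2 returns logical values.

```r
# Hypothetical helper: checks a candidate test function f against the
# stated requirements, using a vector of mark values m.
check_test_function <- function(f, m, ...) {
  m1 <- m
  m2 <- rev(m)                      # a second vector of the same length
  out <- f(m1, m2, ...)
  ok_length  <- length(out) == length(m1)
  ok_numeric <- is.numeric(out) || is.logical(out)
  ok_na      <- !anyNA(out)
  ok_nonneg  <- ok_numeric && ok_na && all(out >= 0)
  c(length = ok_length, numeric = ok_numeric,
    no_NA = ok_na, nonnegative = ok_nonneg)
}

# The product of marks passes every check for positive marks
check_test_function(function(m1, m2) m1 * m2, c(1, 2, 3))

# The difference of marks can be negative, so it fails the last check
check_test_function(function(m1, m2) m1 - m2, c(1, 2, 3))
```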

Alternatively the user may specify the argument f1 instead of f. This indicates that the test function $f$ should take the form $f(u,v)=f_1(u)f_1(v)$ where $f_1(u)$ is given by the argument f1. The argument f1 should be an R function with at least one argument. (It may also take additional arguments, passed through fargs.)

The argument r is the vector of values for the distance $r$ at which $k_f(r)$ is estimated.
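The relationship between f1 and f can be illustrated in base R. In this sketch, f1 and its extra parameter p are hypothetical choices made only for illustration, and f_from_f1 is the two-argument test function that the product form implies.

```r
# Hypothetical one-argument function f1, with an extra parameter p
f1 <- function(m, p = 1) m^p

# The two-argument test function implied by f1: f(u, v) = f1(u) * f1(v)
f_from_f1 <- function(m1, m2, ...) f1(m1, ...) * f1(m2, ...)

m1 <- c(1, 2, 3)
m2 <- c(4, 5, 6)
f_from_f1(m1, m2, p = 2)   # equals (m1 * m2)^2

# Following the Usage section, the corresponding call would be
#   markcorr(X, f1 = f1, fargs = list(p = 2))
# where X is a point pattern with numeric marks.
```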

This algorithm assumes that X can be treated as a realisation of a stationary (spatially homogeneous) random spatial point process in the plane, observed through a bounded window. The window (which is specified in X as X$window) may have arbitrary shape.

Biases due to edge effects are treated in the same manner as in Kest. The edge corrections implemented here are the isotropic correction (selected by "isotropic" or "Ripley") and the translation correction (selected by "translate"). Note that the estimator assumes the process is stationary (spatially homogeneous).

The numerator and denominator of the mark correlation function (in the expression above) are estimated using density estimation techniques. The user can choose between the methods "density", "loess", "sm" and "smrep" (see the argument method).

If normalise=FALSE then the algorithm will compute only the numerator $$c_f(r) = E_{0u}[f(M(0),M(u))]$$ of the expression for the mark correlation function.
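To make the roles of the numerator and denominator concrete, here is a deliberately naive sketch in base R. It is not the algorithm used by markcorr: it applies no edge correction, uses a simple kernel-weighted average over pairs of points, and takes the denominator to be the mean of $f$ over all pairs of observed marks (an assumption of this sketch). The function naive_markcorr and its arguments are hypothetical.

```r
# Naive, uncorrected illustration of k_f(r); NOT the spatstat implementation.
# x, y: point coordinates; m: marks; r: distances; h: kernel half-width.
naive_markcorr <- function(x, y, m, r, h,
                           f = function(m1, m2) m1 * m2,
                           normalise = TRUE) {
  n <- length(x)
  # all ordered pairs (i, j) of distinct points
  i <- rep(seq_len(n), times = n)
  j <- rep(seq_len(n), each = n)
  keep <- i != j
  i <- i[keep]
  j <- j[keep]
  d  <- sqrt((x[i] - x[j])^2 + (y[i] - y[j])^2)  # interpoint distances
  ff <- f(m[i], m[j])                            # test function on each pair
  # Epanechnikov kernel, zero outside [-1, 1]
  epan <- function(u) pmax(0, 0.75 * (1 - u^2))
  # numerator: kernel-weighted mean of f over pairs at distance close to r
  num <- vapply(r, function(r0) {
    w <- epan((d - r0) / h)
    sum(w * ff) / sum(w)
  }, numeric(1))
  if (!normalise) return(num)
  # denominator: mean of f over all pairs of observed marks
  num / mean(outer(m, m, f))
}

# Usage with simulated points and independent uniform marks
set.seed(1)
x <- runif(50)
y <- runif(50)
m <- runif(50)
naive_markcorr(x, y, m, r = c(0.1, 0.2), h = 0.05)
```

With normalise=FALSE the function returns only the kernel-weighted numerator, mirroring the behaviour described above. For constant marks the numerator and denominator coincide, so the normalised value is 1.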

References

Stoyan, D. and Stoyan, H. (1994) Fractals, random shapes and point fields: methods of geometrical statistics. John Wiley and Sons.

Examples

# CONTINUOUS-VALUED MARKS:
    # (1) Spruces
    # marks represent tree diameter
    data(spruces)
    # mark correlation function
    ms <- markcorr(spruces)
    plot(ms)

    # (2) simulated data with independent marks
    X <- rpoispp(100)
    X <- X %mark% runif(X$n)
    Xc <- markcorr(X)
    plot(Xc)
    
    # MULTITYPE DATA:
    # Hughes' amacrine data
    # Cells marked as 'on'/'off'
    data(amacrine)
    # (3) Kernel density estimate with Epanechnikov kernel
    # (as proposed by Stoyan & Stoyan)
    M <- markcorr(amacrine, function(m1,m2) {m1==m2},
                  correction="translate", method="density",
                  kernel="epanechnikov")
    plot(M)
    # Note: kernel="epanechnikov" comes from help(density)

    # (4) Same again with explicit control over bandwidth
    M <- markcorr(amacrine, 
                  correction="translate", method="density",
                  kernel="epanechnikov", bw=0.02)
    # see help(density) for correct interpretation of 'bw'

    # Additional checks (marked as test-only in the source)
    data(betacells)
    betacells <- betacells[seq(1,betacells$n,by=3)]
    niets <- markcorr(betacells, function(m1,m2){m1 == m2}, method="loess")
    niets <- markcorr(X, correction="isotropic", method="smrep", hmult=2)