dppm: Fit Determinantal Point Process Model

Description

Fit a determinantal point process model to a point pattern.

Usage

dppm(formula, family, data=NULL,
       ...,
       startpar = NULL,
       method = c("mincon", "clik2", "palm"),
       weightfun=NULL,
       control=list(),
       algorithm="Nelder-Mead",
       statistic="K",
       statargs=list(),
       rmax = NULL,
       covfunargs=NULL,
       use.gam=FALSE,
       nd=NULL, eps=NULL)

Arguments

formula

A formula in the R language specifying the data (on the left side) and the form of the model to be fitted (on the right side). For a stationary model it suffices to provide a point pattern without a formula. See Details.

family

Information specifying the family of point processes to be used in the model. Typically one of the family functions dppGauss, dppMatern, dppCauchy, dppBessel or dppPowerExp. Alternatively a character string giving the name of a family function, or the result of calling one of the family functions. See Details.

data

The values of spatial covariates (other than the Cartesian coordinates) required by the model. A named list of pixel images, functions, windows, tessellations or numeric constants.

…

Additional arguments. See Details.

startpar

Named vector of starting parameter values for the optimization.

method

The fitting method. Either "mincon" for minimum contrast, "clik2" for second order composite likelihood, or "palm" for Palm likelihood. Partially matched.

weightfun

Optional weighting function $w$ in the composite likelihood or Palm likelihood. A function in the R language. See Details.

control

List of control parameters passed to the optimization function optim.

algorithm

Character string determining the mathematical optimisation algorithm to be used by optim. See the argument method of optim.

statistic

Name of the summary statistic to be used for minimum contrast estimation: either "K" or "pcf".

statargs

Optional list of arguments to be used when calculating the statistic. See Details.

rmax

Maximum value of interpoint distance to use in the composite likelihood.

covfunargs,use.gam,nd,eps

Arguments passed to ppm when fitting the intensity.

Value

An object of class "dppm" representing the fitted model. There are methods for printing, plotting, predicting and simulating objects of this class.

Details

This function fits a determinantal point process model to a point pattern dataset as described in Lavancier et al. (2015).

The model to be fitted is specified by the arguments formula and family.

The argument formula should normally be a formula in the R language. The left hand side of the formula specifies the point pattern dataset to which the model should be fitted. This should be a single argument which may be a point pattern (object of class "ppp") or a quadrature scheme (object of class "quad"). The right hand side of the formula is called the trend and specifies the form of the logarithm of the intensity of the process. Alternatively the argument formula may be a point pattern or quadrature scheme, and the trend formula is taken to be ~1.

The argument family specifies the family of point processes to be used in the model. It is typically one of the family functions dppGauss, dppMatern, dppCauchy, dppBessel or dppPowerExp. Alternatively it may be a character string giving the name of a family function, or the result of calling one of the family functions. A family function belongs to class "detpointprocfamilyfun". The result of calling a family function is a point process family, which belongs to class "detpointprocfamily".

The algorithm first estimates the intensity function of the point process using ppm. If the trend formula is ~1 (the default if a point pattern or quadrature scheme is given rather than a "formula") then the model is homogeneous. The algorithm begins by estimating the intensity as the number of points divided by the area of the window. Otherwise, the model is inhomogeneous. The algorithm begins by fitting a Poisson process with log intensity of the form specified by the formula trend. (See ppm for further explanation).

The interaction parameters of the model are then fitted either by minimum contrast estimation, or by maximum composite likelihood.

Minimum contrast:

If method = "mincon" (the default) interaction parameters of the model will be fitted by minimum contrast estimation, that is, by matching the theoretical $K$-function of the model to the empirical $K$-function of the data, as explained in mincontrast.

For a homogeneous model ( trend = ~1 ) the empirical $K$-function of the data is computed using Kest, and the interaction parameters of the model are estimated by the method of minimum contrast.

For an inhomogeneous model, the inhomogeneous $K$ function is estimated by Kinhom using the fitted intensity. Then the interaction parameters of the model are estimated by the method of minimum contrast using the inhomogeneous $K$ function. This two-step estimation procedure is heavily inspired by Waagepetersen (2007).

If statistic="pcf" then instead of using the $K$-function, the algorithm will use the pair correlation function pcf for homogeneous models and the inhomogeneous pair correlation function pcfinhom for inhomogeneous models. In this case, the smoothing parameters of the pair correlation can be controlled using the argument statargs, as shown in the Examples.

Additional arguments … will be passed to clusterfit to control the minimum contrast fitting algorithm.

Composite likelihood:

If method = "clik2" the interaction parameters of the model will be fitted by maximising the second-order composite likelihood (Guan, 2006). The log composite likelihood is $$ \sum_{i,j} w(d_{ij}) \log\rho(d_{ij}; \theta) - \left( \sum_{i,j} w(d_{ij}) \right) \log \int_D \int_D w(\|u-v\|) \rho(\|u-v\|; \theta)\, du\, dv $$ where the sums are taken over all pairs of data points $x_i, x_j$ separated by a distance $d_{ij} = \| x_i - x_j\|$ less than rmax, and the double integral is taken over all pairs of locations $u,v$ in the spatial window of the data. Here $\rho(d;\theta)$ is the pair correlation function of the model with cluster parameters $\theta$.

The function $w$ in the composite likelihood is a weighting function and may be chosen arbitrarily. It is specified by the argument weightfun. If this is missing or NULL then the default is a threshold weight function, $w(d) = 1(d \le R)$, where $R$ is rmax/2.

Palm likelihood:

If method = "palm" the interaction parameters of the model will be fitted by maximising the Palm loglikelihood (Tanaka et al, 2008) $$ \sum_{i,j} w(x_i, x_j) \log \lambda_P(x_j \mid x_i; \theta) - \int_D w(x_i, u) \lambda_P(u \mid x_i; \theta) {\rm d} u $$ with the same notation as above. Here $\lambda_P(u|v;\theta$ is the Palm intensity of the model at location $u$ given there is a point at $v$.

In all three methods, the optimisation is performed by the generic optimisation algorithm optim. The behaviour of this algorithm can be modified using the argument control. Useful control arguments include trace, maxit and abstol (documented in the help for optim).

Finally, it is also possible to fix any parameters desired before the optimisation by specifying them as name=value in the call to the family function. See Examples.

References

Lavancier, F. Moller, J. and Rubak, E. (2015) Determinantal point process models and statistical inference Journal of the Royal Statistical Society, Series B 77, 853--977.

Guan, Y. (2006) A composite likelihood approach in fitting spatial point process models. Journal of the American Statistical Association 101, 1502--1512.

Tanaka, U. and Ogata, Y. and Stoyan, D. (2008) Parameter estimation and model selection for Neyman-Scott point processes. Biometrical Journal 50, 43--57.

Waagepetersen, R. (2007) An estimating function approach to inference for inhomogeneous Neyman-Scott processes. Biometrics 63, 252--258.

Examples

Run this code

# NOT RUN {
  jpines <- residualspaper$Fig1
  
# }
# NOT RUN {
  dppm(jpines ~ 1, dppGauss)

  dppm(jpines ~ 1, dppGauss, method="c")
  dppm(jpines ~ 1, dppGauss, method="p")

  # Fixing the intensity to lambda=2 rather than the Poisson MLE 2.04:
  dppm(jpines ~ 1, dppGauss(lambda=2))

  if(interactive()) {
   # The following is quite slow (using K-function)
   dppm(jpines ~ x, dppMatern)
  }

   # much faster using pair correlation function
  dppm(jpines ~ x, dppMatern, statistic="pcf", statargs=list(stoyan=0.2))

  # Fixing the Matern shape parameter to nu=2 rather than estimating it:
  dppm(jpines ~ x, dppMatern(nu=2))
# }

Run the code above in your browser using DataLab