nnspat (version 0.1.2)

aij.theta: Closeness or Proximity Matrix for Tango's Spatial Clustering Tests

Description

This function computes the \(A=a_{ij}(\theta)\) matrix used in calculations for Tango's test \(T(\theta)\) for spatial (disease) clustering (see Eqn (2) of Tango (2007)). Here, \(A=a_{ij}(\theta)\) is any matrix of a measure of the closeness between two points \(i\) and \(j\) with \(a_{ii} = 0\) for all \(i = 1, \ldots,n\); \(\theta = (\theta_1,\ldots,\theta_p)^t\) denotes the unknown parameter vector related to cluster size; and \(\delta = (\delta_1,\ldots,\delta_n)^t\) is the case indicator vector, where \(\delta_i=1\) if \(z_i\) is a case and 0 otherwise. The test statistic is then $$T(\theta)=\sum_{i=1}^n\sum_{j=1}^n\delta_i \delta_j a_{ij}(\theta)=\delta^t A(\theta) \delta.$$
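As a minimal base R sketch of the quadratic form above (an illustration, not the package's implementation), assume a hypothetical symmetric closeness matrix `A` with zero diagonal and a hypothetical 0/1 case indicator `delta`. The double sum and the matrix form should agree:

```r
set.seed(1)
n <- 6

# Hypothetical closeness matrix (illustration only): symmetric, zero diagonal
A <- matrix(runif(n * n), n, n)
A <- (A + t(A)) / 2
diag(A) <- 0

# Hypothetical case indicator: 1 for a case, 0 otherwise
delta <- c(1, 0, 1, 1, 0, 0)

# Double-sum form of T(theta)
T_sum <- 0
for (i in 1:n) {
  for (j in 1:n) {
    T_sum <- T_sum + delta[i] * delta[j] * A[i, j]
  }
}

# Matrix (quadratic) form of T(theta)
T_quad <- drop(t(delta) %*% A %*% delta)

all.equal(T_sum, T_quad)
```

Because \(\delta\) is a 0/1 vector, the statistic is simply the sum of the closeness values over all ordered pairs of cases, i.e. `sum(A[delta == 1, delta == 1])`.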

\(T(\theta)\) becomes the Cuzick and Edwards \(T_k\) test statistic (Cuzick and Edwards (1990)) if \(a_{ij}=1\) when \(z_j\) is among the \(k\) nearest neighbors (kNNs) of \(z_i\) and 0 otherwise. In this case, \(\theta=k\) and aij.theta reduces to aij.mat (more specifically, aij.mat(dat,k) and aij.theta(dat,k,model="NN") return the same matrix).
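The kNN indicator matrix described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the data `Y` are hypothetical, distances are Euclidean, and ties in distance (which do not occur for continuous random data) are not given any special treatment, so it may differ from how aij.mat handles them.

```r
set.seed(1)
Y <- matrix(runif(10 * 2), ncol = 2)  # hypothetical data: 10 points in 2D
k <- 3

D <- as.matrix(dist(Y))  # Euclidean interpoint distances
diag(D) <- Inf           # a point is not counted as its own neighbor

n <- nrow(D)
A_nn <- matrix(0, n, n)
for (i in 1:n) {
  # mark the k nearest neighbors of point i
  A_nn[i, order(D[i, ])[1:k]] <- 1
}

rowSums(A_nn)  # every row contains exactly k ones
```

Note that this matrix is generally not symmetric: \(z_j\) can be among the kNNs of \(z_i\) without \(z_i\) being among the kNNs of \(z_j\).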

In Tango's exponential clinal model (Tango (2000)), \(a_{ij}=\exp\left(-4 \left(\frac{d_{ij}}{\theta}\right)^2\right)\) if \(i \ne j\) and 0 otherwise, where \(\theta\) is a predetermined cluster scale such that any pair of cases farther apart than the distance \(\theta\) cannot be considered a cluster, and \(d_{ij}\) denotes the Euclidean distance between two points \(i\) and \(j\).

In the exponential model (Tango (2007)), \(a_{ij}=\exp\left(-\frac{d_{ij}}{\theta}\right)\) if \(i \ne j\) and 0 otherwise, where \(\theta\) and \(d_{ij}\) are as above.

In the hot-spot model (Tango (2007)), \(a_{ij}=1\) if \(d_{ij} \le \theta\) and \(i \ne j\) and 0 otherwise, where \(\theta\) and \(d_{ij}\) are as above.
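The three distance-based closeness measures above can be written directly from their formulas. The following base R sketch uses hypothetical data `Y` and a hypothetical scale `theta`; it mirrors the definitions in the text rather than the package's internal code.

```r
set.seed(1)
Y <- matrix(runif(10 * 2), ncol = 2)  # hypothetical data: 10 points in 2D
theta <- 0.2                          # hypothetical cluster scale

D <- as.matrix(dist(Y))  # Euclidean distances d_ij

# Exponential clinal model: exp(-4 * (d_ij / theta)^2)
A_clinal <- exp(-4 * (D / theta)^2)

# Exponential model: exp(-d_ij / theta)
A_expo <- exp(-D / theta)

# Hot-spot model: 1 if d_ij <= theta, 0 otherwise
A_hot <- (D <= theta) * 1

# a_ii = 0 in all three models
diag(A_clinal) <- 0
diag(A_expo) <- 0
diag(A_hot) <- 0
```

At \(d_{ij}=\theta\) the exponential clinal weight is \(\exp(-4) \approx 0.018\), whereas the exponential weight is \(\exp(-1) \approx 0.368\), so the clinal model discounts pairs near the cluster scale much more strongly.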

The model argument has four options: NN, exp.clinal, exponential, and hot.spot, with exp.clinal being the default. The theta argument specifies the scale of clustering, or the clustering parameter, in the particular spatial disease clustering model.

See also Tango (2007) and the references therein.

Usage

aij.theta(dat, theta, model = "exp.clinal", ...)

Value

The \(A=a_{ij}(\theta)\) matrix useful in calculations for Tango's test \(T(\theta)\).

Arguments

dat

The data set in one or higher dimensions; each row corresponds to a data point.

theta

A predetermined cluster scale so that any pair of cases farther apart than the distance \(\theta\) is unlikely to be a cluster.

model

Type of Tango's spatial clustering model with four options: NN, exp.clinal (default), exponential, and hot.spot.

...

Further arguments, such as method and p, passed to the dist function.
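For reference, the method and p arguments of base R's dist change how the interpoint distances are measured. A small sketch with two hypothetical points, (0,0) and (3,4):

```r
# Two hypothetical points in 2D: (0, 0) and (3, 4)
Y <- matrix(c(0, 0,
              3, 4), ncol = 2, byrow = TRUE)

d_euc <- as.numeric(dist(Y))                               # Euclidean (default): 5
d_max <- as.numeric(dist(Y, method = "maximum"))           # maximum norm: 4
d_man <- as.numeric(dist(Y, method = "manhattan"))         # Manhattan: 7
d_mink <- as.numeric(dist(Y, method = "minkowski", p = 3)) # 91^(1/3), about 4.498

c(d_euc, d_max, d_man, d_mink)
```

dist matches method names partially, so method="max" is accepted as shorthand for "maximum".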

Author

Elvan Ceyhan

References

See Also

aij.mat, aij.nonzero and ceTk

Examples

n<-20  #or try sample(1:20,1)
Y<-matrix(runif(3*n),ncol=3)  #n random points in 3D
k<-3 #try also 1,2

#aij for CE's Tk
Aij<-aij.theta(Y,k,model = "NN")
Aij2<-aij.mat(Y,k)
sum(abs(Aij-Aij2)) #check equivalence of aij.theta and aij.mat with model="NN"

#default model (exp.clinal) with theta=k; method="max" is passed to dist
Aij<-aij.theta(Y,k,method="max")
Aij2<-aij.mat(Y,k)
range(Aij-Aij2)

theta<-0.2
aij.theta(Y,theta,model = "exp.clinal")
aij.theta(Y,theta,model = "exponential")
aij.theta(Y,theta,model = "hot.spot")
