nacf
computes the sample network covariance/correlation function for a specified variable on a given input network. Moran's $I$ and Geary's $C$ statistics at multiple orders may be computed as well.
nacf(net, y, lag.max = NULL, type = c("correlation", "covariance", "moran", "geary"), neighborhood.type = c("in", "out", "total"), partial.neighborhood = TRUE, mode = "digraph", diag = FALSE, thresh = 0, demean = TRUE)
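net: the input network.
y: a numeric vector, of length equal to the order of net.
lag.max: the maximum lag (neighborhood order) at which to compute dependence (defaults to the order of net -1).
type: the type of dependence statistic to be computed.
neighborhood.type: the type of neighborhood to be employed when assessing dependence (as per neighborhood).
partial.neighborhood: logical; should partial (rather than cumulative) neighborhoods be employed?
mode: "digraph" for directed graphs, or "graph" if net is undirected.
diag: logical; does the diagonal of net contain valid data?
thresh: threshold at which to dichotomize net.
demean: logical; demean y prior to analysis?
nacf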
computes dependence statistics for the vector y
on network net
, for neighborhoods of various orders. Specifically, let $A_i$ be the $i$th order adjacency matrix of net
. The sample network autocovariance of $y$ on $A_i$ is then given by
$$
\sigma_i = \frac{\mathbf{y}^T \mathbf{A}_i \mathbf{y}}{E},
$$
where $E = \sum_{j,k} A_{ijk}$ is the sum of the entries of $A_i$. Similarly, the sample network autocorrelation in the above case is $\sigma_i/\sigma_0$, where $\sigma_0$ is the variance of $y$. Moran's $I$ and Geary's $C$ statistics are defined in the usual fashion as
$$
I_i = \frac{N \sum_{j=1}^N \sum_{k=1}^N (y_j-\bar{y}) (y_k-\bar{y}) A_{ijk}}{E \sum_{j=1}^N (y_j-\bar{y})^2},
$$
and
$$
C_i = \frac{(N-1) \sum_{j=1}^N \sum_{k=1}^N (y_j-y_k)^2 A_{ijk}}{2 E \sum_{j=1}^N (y_j-\bar{y})^2}
$$
respectively, where $N$ is the order of $A_i$, $A_{ijk}$ is the $(j,k)$ entry of $A_i$, and $\bar{y}$ is the mean of $y$.
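As a rough illustration of the definitions above, the following base R sketch evaluates the statistics for a single adjacency matrix. The helper name dep_stats and the toy 4-cycle are hypothetical and not part of sna. The sketch assumes that y is demeaned before the covariance is formed (as with demean = TRUE) and that the order-0 value uses the identity matrix as its adjacency matrix.

```r
# Hypothetical helper (not part of sna): dependence statistics of y
# on a single 0/1 adjacency matrix A, following the formulas above.
dep_stats <- function(A, y) {
  N  <- length(y)
  E  <- sum(A)                  # sum of the entries of A
  yc <- y - mean(y)             # demeaned copy of y (assumed, as with demean = TRUE)
  sigma0  <- sum(yc^2) / N      # order-0 value: identity adjacency, so E = N
  autocov <- as.numeric(t(yc) %*% A %*% yc) / E
  moran   <- N * sum(outer(yc, yc) * A) / (E * sum(yc^2))
  geary   <- (N - 1) * sum(outer(y, y, "-")^2 * A) / (2 * E * sum(yc^2))
  c(autocov = autocov, autocor = autocov / sigma0, moran = moran, geary = geary)
}

# Toy example: undirected 4-cycle with alternating values
A <- matrix(0, 4, 4)
A[cbind(c(1, 2, 3, 4), c(2, 3, 4, 1))] <- 1
A <- A + t(A)
y <- c(1, -1, 1, -1)
res <- dep_stats(A, y)
print(res)
```

On this toy graph every edge joins opposite values, so the sketch returns -1 for the autocovariance, autocorrelation, and Moran's $I$, and 1.5 for Geary's $C$.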
The adjacency matrix associated with the $i$th order neighborhood is defined as the identity matrix for order 0, and otherwise depends on the type of neighborhood involved. For input graph $G=(V,E)$, let the base relation, $R$, be given by the underlying graph of $G$ (i.e., $G \cup G^T$) if total neighborhoods are sought, the transpose of $G$ if incoming neighborhoods are sought, or $G$ otherwise. The partial neighborhood structure of order $i>0$ on $R$ is then defined to be the digraph on $V$ whose edge set consists of the ordered pairs $(j,k)$ having geodesic distance $i$ in $R$. The corresponding cumulative neighborhood is formed by the ordered pairs having geodesic distance less than or equal to $i$ in $R$. For purposes of nacf
, these neighborhoods are calculated using neighborhood
, with the specified parameters (including dichotomization at thresh
). The return value for nacf
is the selected dependence statistic, calculated for each neighborhood structure from order 0 (the identity) through order lag.max
(or $N-1$, if lag.max==NULL
). This vector can be used much like the conventional autocorrelation function, to identify dependencies at various lags. This may, in turn, suggest a starting point for modeling via routines such as lnam
.
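The neighborhood definitions above can be sketched in base R as follows. This is only an illustration of the definitions, not the code used by neighborhood: the helper names geo_dist, base_relation, and nbhd are hypothetical, the input is assumed to be a 0/1 adjacency matrix, and self-pairs (distance 0) are assumed to be excluded from cumulative neighborhoods.

```r
# Hypothetical helpers (not part of sna), base R only.

# Geodesic distances of a 0/1 adjacency matrix via Floyd-Warshall
geo_dist <- function(R) {
  D <- ifelse(R > 0, 1, Inf)
  diag(D) <- 0
  for (m in seq_len(nrow(R))) {
    # Allow paths that pass through vertex m
    D <- pmin(D, outer(D[, m], D[m, ], "+"))
  }
  D
}

# Base relation R for the requested neighborhood type
base_relation <- function(G, type = c("in", "out", "total")) {
  type <- match.arg(type)
  switch(type,
         total = 1 * ((G + t(G)) > 0),  # underlying graph of G
         "in"  = t(G),                  # transpose of G
         out   = G)
}

# Adjacency matrix of the order-i neighborhood
nbhd <- function(G, i, type = "in", partial = TRUE) {
  if (i == 0) return(diag(nrow(G)))     # order 0 is the identity
  D <- geo_dist(base_relation(G, type))
  if (partial) 1 * (D == i) else 1 * (D >= 1 & D <= i)
}

# Toy example: directed path 1 -> 2 -> 3 -> 4
G <- matrix(0, 4, 4)
G[cbind(1:3, 2:4)] <- 1
p2 <- nbhd(G, 2, "out")                   # pairs at distance exactly 2
c2 <- nbhd(G, 2, "out", partial = FALSE)  # pairs at distance 1 or 2
print(p2)
print(c2)
```

For the directed path, the partial neighborhood of order 2 contains only the pairs (1,3) and (2,4), while the cumulative neighborhood of order 2 also contains the three original edges.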
Moran, P.A.P. (1950). Notes on Continuous Stochastic Phenomena. Biometrika, 37: 17-23.
geodist
, gapply
, neighborhood
, lnam
, acf
library(sna)
#Create a random graph, and an autocorrelated variable
g<-rgraph(50,tp=4/49)
y<-qr.solve(diag(50)-0.8*g,rnorm(50,0,0.05))
#Examine the network autocorrelation function
nacf(g,y) #Partial neighborhoods
nacf(g,y,partial.neighborhood=FALSE) #Cumulative neighborhoods
#Repeat, using Moran's I on the underlying graph
nacf(g,y,type="moran")
nacf(g,y,partial.neighborhood=FALSE,type="moran")