`reachability`

takes one or more (possibly directed) graphs as input, producing the associated reachability matrices.

`reachability(dat, geodist.precomp=NULL, return.as.edgelist=FALSE, na.omit=TRUE)`

dat

one or more graphs (directed or otherwise).

geodist.precomp

optionally, a precomputed `geodist`

object.

return.as.edgelist

logical; return the result as an sna edgelist?

na.omit

logical; omit missing edges when computing reach?

A reachability matrix, or the equivalent edgelist representation

For a digraph \(G=(V,E)\) with vertices \(i\) and \(j\), let \(P_{ij}\) represent a directed \(ij\) path. Then the (di)graph

$$ R = \left(V\left(G\right),\left\{\left(i,j\right):i,j \in V\left(G\right), P_{ij} \in G\right\}\right)$$

is said to be the *reachability graph* of \(G\), and the adjacency matrix of \(R\) is said to be \(G\)'s *reachability matrix*. (Note that when \(G\) is undirected, we simply take each undirected edge to be bidirectional.) Vertices which are adjacent in the reachability graph are connected by one or more directed paths in the original graph; thus, structural equivalence classes in the reachability graph are synonymous with strongly connected components in the original structure.

Bear in mind that -- as with all matters involving connectedness -- reachability is strongly related to size and density. Since, for any given density, almost all structures of sufficiently large size are connected, reachability graphs associated with large structures will generally be complete. Measures based on the reachability graph, then, will tend to become degenerate in the large \(|V(G)|\) limit (assuming constant positive density).

By default, `reachability`

will try to build the reachability graph using an internal sparse graph approximation; this is no help on fully connected graphs (but not a lot worse than using an adjacency matrix), but will result in considerable savings for large graphs that are heavily fragmented. (The intended design tradeoff is thus that one pays a small cost on the usually cheap cases, in exchange for much greater efficiency on the cases that would otherwise be prohibitively expensive.) If `geodist.precomp`

is given, however, the \(O(N^2)\) cost of an adjacency matrix representation has already been paid, and we simply employ what we are given -- so, if you want to force the internal use of adjacency matrices, just pass a `geodist`

object. Because the internal representation used is otherwise list based, using `return.as.edgelist=TRUE`

will save resources; if you are using `reachability`

as part of a more complex series of calls, it is thus recommended that you both pass and return sna edgelists unless you have a good reason not to do so.

When set, `na.omit`

results in missing edges (i.e., edges with `NA`

values) being removed prior to computation. Since paths are not recomputed when `geodist.precomp`

is passed, this option is only active when `geodist.precomp==NULL`

; if this behavior is desired and precomputed distances are being used, such edges should be removed prior to the `geodist`

call.

Wasserman, S., and Faust, K. (1994). *Social Network Analysis: Methods and Applications.* Cambridge: Cambridge University Press.

```
# NOT RUN {
#Find the reachability matrix for a sparse random graph
g<-rgraph(10,tprob=0.15)
rg<-reachability(g)
g #Compare the two structures
rg
#Compare to the output of geodist
all(rg==(geodist(g)$counts>0))
# }
```

Run the code above in your browser using DataLab