Learn R Programming

sna (version 2.4)

kpath.census: Compute Path or Cycle Census Information

Description

kpath.census and kcycle.census compute \(k\)-path or \(k\)-cycle census statistics (respectively) on one or more input graphs. In addition to aggregate counts of paths or cycles, results may be disaggregated by vertex and co-membership information may be computed.

Usage

kcycle.census(dat, maxlen = 3, mode = "digraph", 
    tabulate.by.vertex = TRUE, cycle.comembership = c("none", "sum",
    "bylength"))

kpath.census(dat, maxlen = 3, mode = "digraph", tabulate.by.vertex = TRUE, path.comembership = c("none", "sum", "bylength"), dyadic.tabulation = c("none", "sum", "bylength"))

Arguments

cycle.comembership

the type of cycle co-membership information to be tabulated, if any. "sum" returns a vertex by vertex matrix of cycle co-membership counts; these are disaggregated by cycle length if "bylength" is used. If "none" is given, no co-membership information is computed.

dat

one or more input graphs.

maxlen

the maximum path/cycle length to evaluate.

mode

"digraph" for directed graphs, or "graph" for undirected graphs.

tabulate.by.vertex

logical; should path or cycle incidence counts be tabulated by vertex?

path.comembership

as per cycle.comembership, for paths rather than cycles.

dyadic.tabulation

the type of dyadic path count information to be tabulated, if any. "sum" returns a vertex by vertex matrix of source/destination path counts, while "bylength" disaggregates these counts by path length. Selecting "none" disables this computation.

Value

For kpath.census, a list with the following elements:

path.count

If tabulate.byvertex==FALSE, a vector of aggregate counts by path length. Otherwise, a matrix whose first column is a vector of aggregate path counts, and whose succeeding columns contain vectors of path counts for each vertex.

path.comemb

If path.comembership!="none", a matrix or array containing co-membership in paths by vertex pairs. If path.comembership=="sum", only a matrix of co-memberships is returned; if bylength is used, however, co-memberships are returned in a maxlen by \(n\) by \(n\) array whose \(i,j,k\)th cell is the number of paths of length \(i\) containing j and k.

paths.bydyad

If dyadic.tabulation!="none", a matrix or array containing the number of paths originating at a particular vertex and terminating. If bylength is used, dyadic path counts are supplied via a maxlen by \(n\) by \(n\) array whose \(i,j,k\)th cell is the number of paths of length \(i\) starting at j and ending with k. If sum is used instead, only a matrix whose \(i,j\) cell contains the total number of paths from \(i\) to \(j\) is returned.

For kcycle.census, a similar list:

cycle.count

If tabulate.byvertex==FALSE, a vector of aggregate counts by cycle length. Otherwise, a matrix whose first column is a vector of aggregate cycle counts, and whose succeeding columns contain vectors of cycle counts for each vertex.

cycle.comemb

If cycle.comembership!="none", a matrix or array containing co-membership in cycles by vertex pairs. If cycle.comembership=="sum", only a matrix of co-memberships is returned; if bylength is used, however, co-memberships are returned in a maxlen by \(n\) by \(n\) array whose \(i,j,k\)th cell is the number of cycles of length \(i\) containing j and k.

Warning

The computational cost of calculating paths and cycles grows very sharply in both maxlen and network density. Be wary of setting maxlen greater than 5-6, unless you know what you are doing. Otherwise, the expected completion time for your calculation may exceed your life expectancy (and those of subsequent generations).

Details

There are several equivalent characterizations of paths and cycles, of which the following is one example. For an arbitrary graph \(G\), a path is a sequence of distinct vertices \(v_1, v_2, \ldots, v_n\) and included edges such that \(v_i\) is adjacent to \(v_{i+1}\) for all \(i \in 1, 2, \ldots, n-1\) via the pair's included edge. (Contrast this with a walk, in which edges and/or vertices may be repeated.) A cycle is the union of a path and an edge making \(v_n\) adjacent to \(v_i\). \(k\)-paths and \(k\)-cycles are respective paths and cycles having \(k\) edges (in the former case) or \(k\) vertices (in the latter). The above definitions may be applied in both directed and undirected contexts, by substituting the appropriate notion of adjacency. (Note that authors do not always employ the same terminology for these concepts, especially in older texts -- it is wise to verify the definitions being used in any particular context.)

A subgraph census statistic is a function which, for any given graph and subgraph, gives the number of copies of the latter contained in the former. A collection of subgraph census statistics is referred to as a subgraph census; widely used examples include the dyad and triad censuses, implemented in sna by the dyad.census and triad.census functions (respectively). kpath.census and kcycle.census compute a range of census statistics related to \(k\)-paths and \(k\)-cycles, including:

  • Aggregate counts of paths/cycles by length (i.e., \(k\)).

  • Counts of paths/cycles to which each vertex belongs (when tabulate.byvertex==TRUE).

  • Counts of path/cycle co-memberships, potentially disaggregated by length (when the appropriate co-membership argument is set to bylength).

  • For path.census, counts of the total number of paths from each vertex to each other vertex, possibly disaggregated by length (if dyadic.tabulation=="bylength").

The length of the maximum-length path/cycle to compute is given by maxlen. These calculations are intrinsically expensive (path/cycle computation is NP complete in the general case), and users should hence be wary when increasing maxlen. On the other hand, it may be possible to enumerate even long paths or cycles on a very sparse graph; scaling is approximately \(c^k\), where \(k\) is given by maxlen and \(c\) is the size of the largest dense cluster.

The paths or cycles computed by this function are directed if mode=="digraph", or undirected if mode=="graph". Failing to set mode correctly may result in problematic behavior.

References

Butts, C.T. (2006). “Cycle Census Statistics for Exponential Random Graph Models.” IMBS Technical Report MBS 06-05, University of California, Irvine.

West, D.B. (1996). Introduction to Graph Theory. Upper Saddle River, N.J.: Prentice Hall.

See Also

dyad.census, triad.census, clique.census, geodist

Examples

Run this code
# NOT RUN {
g<-rgraph(20,tp=1.5/19)

#Obtain paths by vertex, with dyadic path counts
pc<-kpath.census(g,maxlen=5,dyadic.tabulation="sum")
pc$path.count                                 #Examine path counts
pc$paths.bydyad                               #Examine dyadic paths

#Obtain aggregate cycle counts, with co-membership by length
cc<-kcycle.census(g,maxlen=5,tabulate.by.vertex=FALSE,
    cycle.comembership="bylength")
cc$cycle.count                             #Examine cycle counts
cc$cycle.comemb[1,,]                       #Co-membership for 2-cycles
cc$cycle.comemb[2,,]                       #Co-membership for 3-cycles
cc$cycle.comemb[3,,]                       #Co-membership for 4-cycles
# }

Run the code above in your browser using DataLab