Function that calculates, for a specified node pair representing endpoints, path statistics from a sparse precision matrix. The sparse precision matrix is taken to represent the conditional independence graph of a Gaussian graphical model. The contribution to the observed covariance between the specified endpoints is calculated for each (heuristically) determined path between the endpoints.
GGMpathStats(P0, node1, node2, neiExpansions = 2, verbose = TRUE, graph = TRUE,
nrPaths = 2, lay = layout.circle, nodecol = "skyblue", Vsize = 15,
Vcex = 0.6, VBcolor = "darkblue", VLcolor = "black",
all.edges = TRUE, prune = TRUE, legend = TRUE, scale = 1,
Lcex = 0.8, PTcex = 2, main = "")
P0: Sparse (possibly standardized) precision matrix.
node1: A numeric specifying an endpoint. The numeric should correspond to a row/column of the precision matrix and as such represents the corresponding variable.
node2: A numeric specifying a second endpoint. The numeric should correspond to a row/column of the precision matrix and as such represents the corresponding variable.
neiExpansions: A numeric determining how many times the neighborhood around the respective endpoints should be expanded in the search for shortest paths between the node pair.
verbose: A logical indicating if a summary of the results should be printed on screen.
graph: A logical indicating if the strongest paths should be visualized with a graph.
nrPaths: A numeric indicating the number of paths (with the highest contribution to the marginal covariance between the indicated node pair) to be visualized/highlighted.
lay: Function call to igraph determining the placement of vertices.
nodecol: A character determining the color of node1 and node2.
Vsize: A numeric determining the vertex size.
Vcex: A numeric determining the size of the vertex labels.
VBcolor: A character determining the color of the vertex borders.
VLcolor: A character determining the color of the vertex labels.
all.edges: A logical indicating if edges other than those implied by the nrPaths-paths between node1 and node2 should also be visualized.
prune: A logical determining if vertices of degree 0 should be removed.
legend: A logical indicating if the graph should come with a legend.
scale: A numeric representing a scale factor for visualizing the strength of edges. It is a relative scaling factor: the edges implied by the nrPaths-paths between node1 and node2 have an edge thickness that is twice this scaling factor (so it is a scaling factor vis-a-vis the unimplied edges).
Lcex: A numeric determining the size of the legend box.
PTcex: A numeric determining the size of the exemplary lines in the legend box.
main: A character giving the main figure title.
An object of class list:
A matrix specifying the paths, their respective lengths, and their respective contributions to
the marginal covariance between the endpoints.
A list representing the respective paths as numeric vectors.
A data.frame in which each numeric from paths is connected to an identifier such as a variable name.
The conditional independence graph (as implied by the sparse precision matrix) is undirected. In undirected
graphs origin and destination are interchangeable and are both referred to as 'endpoints' of a path. The
function searches for shortest paths between the specified endpoints node1 and node2.
It searches for shortest paths that visit nodes only once. The shortest paths
between the provided endpoints are determined heuristically by the following procedure. The search is initiated
by application of the get.all.shortest.paths-function from the igraph-package,
which yields all shortest paths between the nodes. Next, the neighborhoods of the endpoints are defined
(excluding the endpoints themselves). Then, the shortest paths are found between: (a)
node1 and node Vs in its neighborhood; (b) node Vs in the node1-neighborhood and node
Ve in the node2-neighborhood; and (c) node Ve in the node2-neighborhood and node2.
These paths are glued and new shortest path candidates are obtained (preserving only novel paths). In additional
iterations (specified by neiExpansions) the node1- and node2-neighborhood are expanded by
including their neighbors (still excluding the endpoints) and shortest paths are again
searched as described above.
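The search procedure above can be sketched for a single neighborhood expansion as follows. This is a simplified illustration on a hypothetical 5-node graph, not the function's actual implementation: the toy edge list, the variable names, and the use of the current igraph function names (all_shortest_paths, shortest_paths, neighbors) are assumptions made for the example.

```r
library(igraph)

# Hypothetical toy graph standing in for the conditional independence graph.
# Edges: 1-2, 2-5 (short route) and 1-3, 3-4, 4-5 (longer route).
g <- make_graph(c(1, 2,  2, 5,  1, 3,  3, 4,  4, 5), directed = FALSE)
node1 <- 1L
node2 <- 5L

# Step 1: all shortest paths between the two endpoints.
found <- lapply(all_shortest_paths(g, from = node1, to = node2)$res, as.integer)

# Step 2: first-order neighborhoods, excluding the endpoints themselves.
nb1 <- setdiff(as.integer(neighbors(g, node1)), c(node1, node2))
nb2 <- setdiff(as.integer(neighbors(g, node2)), c(node1, node2))

# Step 3: glue node1 -> Vs -> ... -> Ve -> node2 and keep novel candidates
# that visit every node only once.
for (vs in nb1) {
  for (ve in nb2) {
    mid  <- as.integer(shortest_paths(g, from = vs, to = ve)$vpath[[1]])
    cand <- c(node1, mid, node2)
    is_simple <- length(mid) > 0 && anyDuplicated(cand) == 0
    is_novel  <- !any(vapply(found, identical, logical(1), cand))
    if (is_simple && is_novel) found <- c(found, list(cand))
  }
}

found
```

In this toy graph the initial search yields only the path 1-2-5, and the gluing step adds the longer path 1-3-4-5. Further expansions would repeat steps 2 and 3 with larger neighborhoods.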
The contribution of a particular path to the observed covariance between the specified
node pair is calculated in accordance with Theorem 1 of Jones and West (2005). As in Jones and West (2005),
paths whose weights have a sign opposite to that of the marginal covariance
(between the endpoints of the path) are referred to as 'moderating' paths, while paths whose weights
have the same sign as the marginal covariance are referred to as 'mediating' paths. Such
paths are visualized when graph = TRUE.
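As a rough illustration of the decomposition, the sketch below assumes the usual statement of the result: the weight of a path with m nodes is (-1)^(m+1) times the product of the precision entries along its edges, times the determinant of the precision matrix with the path's nodes removed, divided by the determinant of the full precision matrix. The 3-node matrix and the helper path_weight are hypothetical and serve only this example.

```r
# Hypothetical 3-node precision matrix with all off-diagonal entries nonzero,
# so nodes 1 and 3 are connected by exactly two paths: 1-3 and 1-2-3.
Omega <- matrix(c( 2.0, -0.5,  0.3,
                  -0.5,  2.0, -0.5,
                   0.3, -0.5,  2.0), nrow = 3, byrow = TRUE)

# Hypothetical helper: contribution of one path to the covariance
# between its endpoints, under the assumed form of the decomposition.
path_weight <- function(Omega, path) {
  m <- length(path)
  # product of precision entries along consecutive node pairs of the path
  edge_prod <- prod(Omega[cbind(path[-m], path[-1])])
  # determinant of the precision matrix with the path's nodes removed
  rest <- setdiff(seq_len(nrow(Omega)), path)
  det_rest <- if (length(rest) == 0) 1 else det(Omega[rest, rest, drop = FALSE])
  (-1)^(m + 1) * edge_prod * det_rest / det(Omega)
}

paths   <- list(c(1, 3), c(1, 2, 3))
w       <- vapply(paths, path_weight, numeric(1), Omega = Omega)
Sigma13 <- solve(Omega)[1, 3]   # marginal covariance between nodes 1 and 3

# Same sign as the marginal covariance: mediating; opposite sign: moderating.
kind <- ifelse(sign(w) == sign(Sigma13), "mediating", "moderating")
data.frame(path = vapply(paths, paste, character(1), collapse = "-"),
           weight = w, kind = kind)
```

Because both paths between nodes 1 and 3 are enumerated here, the path weights sum to the marginal covariance. With a heuristic search that finds only some paths, the sum would in general differ from it.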
All arguments following the graph argument are only (potentially) used when graph = TRUE.
When graph = TRUE the conditional independence graph is returned with the paths highlighted that have the
highest contribution to the marginal covariance between the specified endpoints. The number of paths highlighted
is indicated by nrPaths. The edges of mediating paths are represented in green while the edges of moderating
paths are represented in red. When all.edges = TRUE the edges other than those implied by the nrPaths-paths between
node1 and node2 are also visualized (in lightgrey). When all.edges = FALSE only the mediating and
moderating paths implied by nrPaths are visualized.
The default layout gives a circular placement of the vertices. All layout functions supported by
igraph are supported. The arguments Lcex and PTcex are only used when legend = TRUE.
If prune = TRUE the vertices of degree 0 (vertices not implicated by any edge) are removed. For the colors supported
by the arguments nodecol, VBcolor, and VLcolor, see
https://stat.columbia.edu/~tzheng/files/Rcolor.pdf.
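The highlighting, all.edges, and prune behavior described in these notes can be imitated directly in igraph. The sketch below is only an approximation under stated assumptions: the 6-node graph, the choice of which path is mediating and which is moderating, and all variable names are hypothetical, and the plotting call is not the function's own code.

```r
library(igraph)

# Hypothetical graph; endpoints are nodes 1 and 3.
g <- make_graph(c(1, 2,  2, 3,  1, 5,  5, 4,  4, 3,  4, 6), directed = FALSE)
scale      <- 1
mediating  <- c(1, 2, 3)      # assumed mediating path
moderating <- c(1, 5, 4, 3)   # assumed moderating path

# Unimplied edges: lightgrey, thickness equal to the scale factor.
E(g)$color <- "lightgrey"
E(g)$width <- scale

# Edges on the highlighted paths: green or red, twice the scale factor.
med_ids <- as.integer(E(g, path = mediating))
mod_ids <- as.integer(E(g, path = moderating))
E(g)$color[med_ids] <- "green"
E(g)$color[mod_ids] <- "red"
E(g)$width[c(med_ids, mod_ids)] <- 2 * scale

# Endpoints get their own vertex color.
V(g)$color <- "white"
V(g)$color[c(1, 3)] <- "skyblue"

# Analogue of all.edges = FALSE: keep only edges on highlighted paths.
g_paths <- delete_edges(g, which(E(g)$color == "lightgrey"))
# Analogue of prune = TRUE: drop vertices left with degree 0.
g_pruned <- delete_vertices(g_paths, which(degree(g_paths) == 0))

plot(g, layout = layout_in_circle(g), vertex.size = 15, vertex.label.cex = 0.6,
     vertex.frame.color = "darkblue", vertex.label.color = "black",
     edge.color = E(g)$color, edge.width = E(g)$width)
legend("bottomright", legend = c("mediating path", "moderating path"),
       lty = 1, lwd = 2, col = c("green", "red"), cex = 0.8)
```

Here node 6 is attached only through an unimplied edge, so it is isolated once that edge is dropped and is removed by the pruning step.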
Eppstein, D. (1998). Finding the k Shortest Paths. SIAM Journal on Computing 28: 652-673.
Jones, B., and West, M. (2005). Covariance Decomposition in Undirected Gaussian Graphical Models. Biometrika 92: 779-786.
# NOT RUN {
## Load the package that provides the functions used below
library(rags2ridges)
## Obtain some (high-dimensional) data
p <- 25
n <- 10
set.seed(333)
X <- matrix(rnorm(n*p), nrow = n, ncol = p)
colnames(X) <- letters[1:p]
## Obtain regularized precision under optimal penalty
OPT <- optPenalty.LOOCVauto(X, lambdaMin = .5, lambdaMax = 30)
## Determine support regularized standardized precision under optimal penalty
PC0 <- sparsify(OPT$optPrec, threshold = "localFDR")$sparseParCor
## Obtain information on mediating and moderating paths between nodes 14 and 23
pathStats <- GGMpathStats(PC0, 14, 23, verbose = TRUE, prune = FALSE)
pathStats
# }