BCDAG (version 1.1.2)

get_diagnostics: MCMC diagnostics

Description

This function provides convergence diagnostics for the MCMC output of the learn_DAG function.

Usage

get_diagnostics(learnDAG_output, ask = TRUE, nodes = integer(0))

Value

A collection of plots summarizing the behavior of the number of edges and the posterior probabilities of edge inclusion computed from the MCMC output.

Arguments

learnDAG_output

object of class bcdag

ask

Boolean argument passed to par(); if TRUE, the user is prompted before each new plot is drawn.

nodes

Numerical vector indicating the nodes for which the posterior probability of edge inclusion is computed.

Author

Federico Castelletti and Alessandro Mascaro

Details

Function learn_DAG implements a Markov Chain Monte Carlo (MCMC) algorithm for structure learning and posterior inference of Gaussian DAGs. The output of the algorithm is a collection of \(S\) DAG structures (represented as \((q,q)\) adjacency matrices) and DAG parameters \((D,L)\) approximately drawn from the joint posterior. If learn_DAG is run with collapse = TRUE, only the approximate marginal posterior of DAGs (represented by the collection of \(S\) DAG structures) is returned; see the documentation of learn_DAG for more details.

Diagnostics of convergence for the MCMC output are conducted by monitoring, across MCMC iterations: (1) the number of edges in the DAGs; (2) the posterior probability of edge inclusion for each possible edge \(u -> v\).

With regard to (1), the function first returns a traceplot of the number of edges in the DAGs visited by the MCMC chain at each step \(s = 1, ..., S\). The absence of trends in this plot suggests genuine convergence of the MCMC chain. In addition, the traceplot of the average number of edges in the DAGs visited up to time \(s\), for \(s = 1, ..., S\), is also returned; stabilization of this curve around a "stable" average size generally indicates good convergence of the algorithm. With regard to (2), for each edge \(u -> v\), the posterior probability at time \(s\), for \(s = 1, ..., S\), can be estimated as the proportion of DAGs visited by the MCMC up to time \(s\) which contain the directed edge \(u -> v\). The output is organized in \(q\) plots (one for each node \(v = 1, ..., q\)), each summarizing the posterior probabilities of the edges \(u -> v\), \(u = 1, ..., q\). If the number of nodes is larger than 30, traceplots for a random sample of 30 nodes are returned.
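The two running quantities described above can also be computed by hand. The sketch below is not part of the package API; it assumes the MCMC output is available as a list of \((q,q)\) adjacency matrices (simulated here for illustration) and reproduces the running average number of edges and the running posterior probability of inclusion for one edge.

```r
# Illustration only: simulate S upper-triangular (hence acyclic) adjacency matrices
# standing in for the DAGs visited by the MCMC chain
set.seed(1)
q <- 4; S <- 100
Graphs <- replicate(S, {
  A <- matrix(rbinom(q * q, 1, 0.2), q, q)
  A[lower.tri(A, diag = TRUE)] <- 0
  A
}, simplify = FALSE)

# (1) number of edges at each iteration and its running average
n_edges <- sapply(Graphs, sum)
running_avg <- cumsum(n_edges) / seq_len(S)

# (2) running posterior probability of inclusion for a given edge u -> v,
# estimated as the proportion of visited DAGs containing that edge
u <- 1; v <- 2
incl <- sapply(Graphs, function(A) A[u, v])
running_prob <- cumsum(incl) / seq_len(S)
```

Plotting `n_edges`, `running_avg`, and `running_prob` against the iteration index yields traceplots analogous to those produced by get_diagnostics.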

References

F. Castelletti and A. Mascaro (2021). Structural learning and estimation of joint causal effects among network-dependent variables. Statistical Methods and Applications, Advance publication.

F. Castelletti (2020). Bayesian model selection of Gaussian Directed Acyclic Graph structures. International Statistical Review, 88, 752-775.

Examples

# Randomly generate a DAG and the DAG-parameters
q = 8
w = 0.2
set.seed(123)
DAG = rDAG(q = q, w = w)
outDL = rDAGWishart(n = 1, DAG = DAG, a = q, U = diag(1, q))
L = outDL$L; D = outDL$D
Sigma = solve(t(L)) %*% D %*% solve(L)
n = 200
# Generate observations from a Gaussian DAG-model
X = mvtnorm::rmvnorm(n = n, sigma = Sigma)
# Run the MCMC for posterior inference of DAGs only (collapse = TRUE)
out_mcmc = learn_DAG(S = 5000, burn = 1000, a = q, U = diag(1, q)/n, data = X, w = 0.1,
                     fast = TRUE, save.memory = FALSE, collapse = TRUE)
# Produce diagnostic plots
get_diagnostics(out_mcmc)