This function implements an iterative search for the maximum a posteriori (MAP) DAG,
by means of order MCMC (arXiv:1803.07859v3). At each iteration, the current search space is expanded by
allowing each node to have up to one additional parent not already included in the search space.
By default the initial search space is obtained through the PC-algorithm (using the functions skeleton
and pc
from the `pcalg' package [Kalisch et al, 2012]).
At each iteration order MCMC is employed to search for the MAP DAG.
The edges in the MAP DAG are added to the initial search space to provide
the search space for the next iteration. The algorithm iterates until no
further score improvements can be achieved by expanding the search space.
The final search space may be used for the sampling versions of orderMCMC
and partitionMCMC
.
iterativeMCMC(
scorepar,
MAP = TRUE,
posterior = 0.5,
softlimit = 9,
hardlimit = 12,
alpha = 0.05,
gamma = 1,
verbose = TRUE,
chainout = FALSE,
scoreout = FALSE,
cpdag = FALSE,
mergetype = "skeleton",
iterations = NULL,
moveprobs = NULL,
stepsave = NULL,
startorder = NULL,
accum = FALSE,
compress = TRUE,
plus1it = NULL,
startspace = NULL,
blacklist = NULL,
addspace = NULL,
scoretable = NULL,
alphainit = NULL
)# S3 method for iterativeMCMC
plot(
x,
...,
main = "iterative MCMC, DAG scores",
xlab = "MCMC step",
ylab = "DAG logscore",
type = "l",
col = "blue"
)
# S3 method for iterativeMCMC
print(x, ...)
# S3 method for iterativeMCMC
summary(object, ...)
Object of class iterativeMCMC
, which contains log-score trace as well as adjacency matrix of the maximum scoring DAG, its score and the order score.
The output can optionally include DAGs sampled in MCMC iterations and the score tables. Optional output is regulated by the parameters chainout
and scoreout
. See iterativeMCMC class
for a detailed class structure.
an object of class scoreparameters
, containing the data and scoring parameters; see constructor function scoreparameters
logical, if TRUE (default) the search targets the MAP DAG (a DAG with maximum score),
if FALSE at each MCMC step a DAG is sampled from the order proportionally to its score; when expanding a search space when MAP=TRUE all edges from the maximum scoring DAG are added
to the new space, when MAP=FALSE only edges with posterior probability higher than defined by parameter posterior
are added to the search space
logical, when MAP
set to FALSE defines posterior probability threshold for adding the edges to the search space
integer, limit on the size of parent sets beyond which adding undirected edges is restricted; below this
limit edges are added to expand the parent sets based on the undirected skeleton of the MAP DAG (or from its CPDAG, depending
on the parameter mergecp
), above the limit only the directed edges are added from the MAP DAG; the limit is 9 by default
integer, limit on the size of parent sets beyond which the search space is not further expanded to prevent long runtimes; the limit is 12 by default
numerical significance value in {0,1}
for the conditional independence tests in the PC-stage
tuning parameter which transforms the score by raising it to this power, 1 by default
logical, if TRUE (default) prints messages on the progress of execution
logical, if TRUE the saved MCMC steps are returned, FALSE by default
logical, if TRUE the search space from the last plus1 iterations and the corresponding score tables are returned, FALSE by default
logical, if set to TRUE the equivalence class (CPDAG) found by the PC algorithm is used as a search space, when FALSE (default) the undirected skeleton used as a search space
defines which edges are added to the search space at each expansion iteration; three options are available 'dag', 'cpdag', 'skeleton'; 'skeleton' by default
integer, the number of MCMC steps, the default value is \(3.5n^{2}\log{n}\)
a numerical vector of 4 values in {0,1}
corresponding to the probabilities of the following MCMC moves in the order space:
exchanging 2 random nodes in the order
exchanging 2 adjacent nodes in the order
placing a single node elsewhere in the order
staying still
integer, thinning interval for the MCMC chain, indicating the number of steps between two output iterations, the default is iterations
/1000
integer vector of length n, which will be used as the starting order in the MCMC algorithm, the default order is random
logical, when TRUE at each search step expansion new edges are added to the current search space; when FALSE (default) the new edges are added to the starting space
logical, if TRUE adjacency matrices representing sampled graphs will be stored as a sparse Matrix (recommended); TRUE by default
(optional) integer, a number of iterations of search space expansion; by default the algorithm iterates until no score improvement can be achieved by further expanding the search space
(optional) a square matrix, of dimensions equal to the number of nodes, which defines the search space for the order MCMC in the form of an adjacency matrix; if NULL, the skeleton obtained from the PC-algorithm will be used; if startspace[i,j]
equals to 1 (0) it means that the edge from node i
to node j
is included (excluded) from the search space; to include an edge in both directions, both startspace[i,j]
and startspace[j,i]
should be 1
(optional) a square matrix, of dimensions equal to the number of nodes, which defines edges to exclude from the search space; if blacklist[i,j]
equals to 1 it means that the edge from node i
to node j
is excluded from the search space
"dag", then edges from maximum scoring DAG are added;
"cpdag", then the maximum scoring DAG is first converted to the CPDAG, from which all edges are added to the search space;
"skeleton", then the maximum scoring DAG is first converted to the skeleton, from which all edges are added to the search space
(optional) a square matrix, of dimensions equal to the number of nodes, which defines the edges, which are added at to the search space only at the first iteration of iterative seach and do not necessarily stay afterwards; defined in the form of an adjacency matrix; if addspace[i,j]
equals to 1 (0) it means that the edge from node i
to node j
is included (excluded) from the search space; to include an edge in both directions, both addspace[i,j]
and addspace[j,i]
should be 1
(optional) object of class scorespace
. When not NULL, parameters startspace
and addspace
are ignored.
(optional) numerical, defines alpha that is used by the PC algorithm to learn initial structure of a DBN, ignored in static case
object of class 'iterativeMCMC'
ignored
name of the graph; "iterative MCMC, DAG scores" by default
name of x-axis; "MCMC step"
name of y-axis; "DAG logscore"
type of line in the plot; "l" by default
colour of line in the plot; "blue" by default
object of class 'iterativeMCMC'
Polina Suter, Jack Kuipers
P. Suter, J. Kuipers, G. Moffa, N.Beerenwinkel (2023) <doi:10.18637/jss.v105.i09>
Kuipers J, Super P and Moffa G (2020). Efficient Sampling and Structure Learning of Bayesian Networks. (arXiv:1803.07859v3)
Friedman N and Koller D (2003). A Bayesian approach to structure discovery in bayesian networks. Machine Learning 50, 95-125.
Kalisch M, Maechler M, Colombo D, Maathuis M and Buehlmann P (2012). Causal inference using graphical models with the R package pcalg. Journal of Statistical Software 47, 1-26.
Geiger D and Heckerman D (2002). Parameter priors for directed acyclic graphical models and the characterization of several probability distributions. The Annals of Statistics 30, 1412-1440.
Kuipers J, Moffa G and Heckerman D (2014). Addendum on the scoring of Gaussian directed acyclic graphical models. The Annals of Statistics 42, 1689-1691.
Spirtes P, Glymour C and Scheines R (2000). Causation, Prediction, and Search, 2nd edition. The MIT Press.
if (FALSE) {
Bostonpar<-scoreparameters("bge",Boston)
itfit<-iterativeMCMC(Bostonpar, chainout=TRUE, scoreout=TRUE)
plot(itfit)
}
Run the code above in your browser using DataLab