gies: Estimate Interventional Markov Equivalence Class of a DAG by GIES

Description

Estimate the interventional essential graph representing the Markov equivalence class of a DAG using the greedy interventional equivalence search (GIES) algorithm of Hauser and B<U+00FC>hlmann (2012).

Usage

gies(score, labels = score$getNodes(), targets = score$getTargets(),
     fixedGaps = NULL, adaptive = c("none", "vstructures", "triples"), 
     phase = c("forward", "backward", "turning"), iterate = length(phase) > 1,
     turning = NULL, maxDegree = integer(0), verbose = FALSE, ...)

Arguments

score

An R object inheriting from '>Score.

labels

Node labels; by default, they are determined from the scoring object.

targets

A list of intervention targets (cf. details). A list of vectors, each vector listing the vertices of one intervention target.

fixedGaps

Logical symmetric matrix of dimension p*p. If entry [i, j] is TRUE, the result is guaranteed to have no edge between nodes $i$ and $j$ .

adaptive

indicating whether constraints should be adapted to newly detected v-structures or unshielded triples (cf. details).

phase

Character vector listing the phases that should be used; possible values: forward, backward, and turning (cf. details).

iterate

Logical indicating whether the phases listed in the argument phase should be iterated more than once (iterate = TRUE) or not.

turning

Setting turning = TRUE is equivalent to setting phases = c("forward", "backward") and iterate = FALSE; the use of the argument turning is deprecated.

maxDegree

Parameter used to limit the vertex degree of the estimated graph. Possible values:

Vector of length 0 (default): vertex degree is not limited.
Real number $r$ , $0 < r < 1$ : degree of vertex $v$ is limited to $r \cdot n_{v}$ , where $n_{v}$ denotes the number of data points where $v$ was not intervened.
Single integer: uniform bound of vertex degree for all vertices of the graph.
Integer vector of length p: vector of individual bounds for the vertex degrees.

verbose

If TRUE, detailed output is provided.

…

Additional arguments for debugging purposes and fine tuning.

Value

gies returns a list with the following two components:

essgraph

An object of class '>EssGraph containing an estimate of the equivalence class of the underlying DAG.

repr

An object of a class derived from '>ParDAG containing a (random) representative of the estimated equivalence class.

Details

This function estimates the interventional Markov equivalence class of a DAG based on a data sample with interventional data originating from various interventions and possibly observational data. The intervention targets used for data generation must be specified by the argument targets as a list of (integer) vectors listing the intervened vertices; observational data is specified by an empty set, i.e. a vector of the form integer(0). As an example, if data contains observational samples as well as samples originating from an intervention at vertices 1 and 4, the intervention targets must be specified as list(integer(0), as.integer(1), as.integer(c(1, 4))).

An interventional Markov equivalence class of DAGs can be uniquely represented by a partially directed graph called interventional essential graph. Its edges have the following interpretation:

a directed edge $a ⟶ b$ stands for an arrow that has the same orientation in all representatives of the interventional Markov equivalence class;
an undirected edge $a$ -- $b$ stands for an arrow that is oriented in one way in some representatives of the equivalence class and in the other way in other representatives of the equivalence class.

Note that when plotting the object, undirected and bidirected edges are equivalent.

GIES (greedy interventional equivalence search) is a score-based algorithm that greedily maximizes a score function (typically the BIC, passed to the function via the argument score) in the space of interventional essential graphs in three phases, starting from the empty graph:

Forward phase: In the forward phase, GIES moves through the space of interventional essential graphs in steps that correspond to the addition of a single edge in the space of DAGs; the phase is aborted as soon as the score cannot be augmented any more.
Backward phase: In the backward phase, the algorithm performs moves that correspond to the removal of a single edge in the space of DAGs until the score cannot be augmented any more.
Turning phase: In the turning phase, the algorithm performs moves that correspond to the reversal of a single arrow in the space of DAGs until the score cannot be augmented any more.

The phases that are actually run are specified with the argument phase. GIES cycles through the specified phases until no augmentation of the score is possible any more if iterate = TRUE. GIES is an interventional extension of the GES (greedy equivalence search) algorithm of Chickering (2002) which is limited to observational data and hence operates on the space of observational instead of interventional Markov equivalence classes.

Using the argument fixedGaps, one can make sure that certain edges will not be present in the resulting essential graph: if the entry [i, j] of the matrix passed to fixedGaps is TRUE, there will be no edge between nodes $i$ and $j$ . Using this argument can speed up the execution of GIES and allows the user to account for previous knowledge or other constraints. The argument adaptive can be used to relax the constraints encoded by fixedGaps as follows:

When adaptive = "vstructures" and the algorithm introduces a new v-structure $a ⟶ b ⟵ c$ in the forward phase, then the edge $a - c$ is removed from the list of fixed gaps, meaning that the insertion of an edge between $a$ and $c$ becomes possible even if it was forbidden by the initial matrix passed to fixedGaps.
When adaptive = "triples" and the algorithm introduces a new unshielded triple in the forward phase (i.e., a subgraph of three nodes $a$ , $b$ and $c$ where $a$ and $b$ as well as $b$ and $c$ are adjacent, but $a$ and $c$ are not), then the edge $a - c$ is removed from the list of fixed gaps.

This modifications of the forward phase of GIES are inspired by the analog modifications in the forward phase of GES, which makes the successive application of a skeleton estimation method and GES restricted to an estimated skeleton a consistent estimator of the DAG (cf. Nandy, Hauser and Maathuis, 2015).

References

D.M. Chickering (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research 3, 507--554

A. Hauser and P. B<U+00FC>hlmann (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409--2464.

P. Nandy, A. Hauser and M. Maathuis (2015). Understanding consistency in hybrid causal structure learning. arXiv preprint 1507.02608

Examples

Run this code

# NOT RUN {
## Load predefined data
data(gmInt)

## Define the score (BIC)
score <- new("GaussL0penIntScore", gmInt$x, gmInt$targets, gmInt$target.index)

## Estimate the essential graph
gies.fit <- gies(score)

## Plot the estimated essential graph and the true DAG
if (require(Rgraphviz)) {
  par(mfrow=c(1,2))
  plot(gies.fit$essgraph, main = "Estimated ess. graph")
  plot(gmInt$g, main = "True DAG")
}
# }

Run the code above in your browser using DataLab

Data Engineering and BI courses are free this week!