ges: Estimate the Markov equivalence class of a DAG using GES

Description

Estimate the observational essential graph representing the Markov equivalence class of a DAG using the greedy equivalence search (GES) algorithm of Chickering (2002).

Usage

ges(p, score, fixedGaps = NULL, 
  turning = TRUE, maxDegree = integer(0), verbose = FALSE, ...)

Arguments

Number of variables.

score

An instance of a class derived from Score which only accounts for observational data.

fixedGaps

logical symmetric matrix of dimension p*p. If entry [i, j] is TRUE, the result is guaranteed to have no edge between nodes $i$ and $j$.

turning

Logical indicating whether the function should try to augment the score by turning edges (cf. details).

maxDegree

Parameter used to limit the vertex degree of the estimated graph. Valid arguments:

Vector of length 0 (default): vertex degree is not limited.
Real number$r$,$0 < r < 1$: degree of vertex$v$is limited to$r \cdot n_v$, wher

verbose

if TRUE, detailed output is provided.

...

Additional arguments for debugging purposes and fine tuning.

Value

ges returns a list with the following two components:
essgraphAn object of class EssGraph containing an estimate of the equivalence class of the underlying DAG.
reprAn object of a class derived from ParDAG containing a (random) representative of the estimated equivalence class.

encoding

UTF-8

concept

greedy equivalence search
essential graph
CPDAG

Details

Under the assumption that the distribution of the observed variables is faithful to a DAG, this function estimates the Markov equivalence class of the DAG. It does not estimate the DAG itself, because this is typically impossible (even with an infinite amount of data): different DAGs (forming a Markov equivalence class) can describe the same conditional independence relationships and be statistically indistinguishable from observational data alone.

All DAGs in an equivalence class have the same skeleton (i.e., the same adjacency information) and the same v-structures (i.e., the same induced subgraphs of the form $a \longrightarrow b \longleftarrow c$). However, the direction of some edges may be undetermined, in the sense that they point one way in one DAG in the equivalence class, while they point the other way in another DAG in the equivalence class.

An equivalence class can be uniquely represented by a partially directed graph called (observational) essential graph or CPDAG (completed partially directed acyclic graph). Its edges have the following interpretation:

a directed edge$a \longrightarrow b$stands for an arrow that has the same orientation in all representatives of the Markov equivalence class;
an undirected edge a -- b stands for an arrow that is oriented in one way in some representatives of the equivalence class and in the other way in other representatives of the equivalence class.

Note that when plotting the object, undirected and bidirected edges are equivalent.

GES (greedy equivalence search) is a score-based algorithm that greedily maximizes a score function (typically the BIC, passed to the function via the argument score) in the space of (observational) essential graphs in three phases, starting from the empty graph: [object Object],[object Object],[object Object] GES cycles through these three phases until no augmentation of the score is possible any more. Note that the turning phase (activated with turning = TRUE, the default behaviour) was not part of the original implementation of Chickering (2002), but was introduced by Hauser and Bühlmann (2012) and shown to improve the overall estimation performance.

GES has the same purpose as the PC algorithm (see pc). While the PC algorithm is based on conditional independence tests (requiring the choice of an independence test and a significance level, see pc), the GES algorithm is a score-based method (requiring the choice of a score function) and does not depend on conditional independence tests. Since GES always operates in the space of essential graphs, it returns a valid essential graph (or CPDAG) in any case.

References

D.M. Chickering (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research 3, 507--554

A. Hauser and P. Bühlmann (2012). Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs. Journal of Machine Learning Research 13, 2409--2464.

P. Spirtes, C.N. Glymour, and R. Scheines (2000). Causation, Prediction, and Search, MIT Press, Cambridge (MA).

Examples

Run this code

## Load predefined data
data(gmG)

## Define the score (BIC)
score <- new("GaussL0penObsScore", gmG8$x)

## Estimate the essential graph
ges.fit <- ges(ncol(gmG8$x), score)

## Plot the estimated essential graph and the true DAG
if (require(Rgraphviz)) {
  par(mfrow=c(1,2))
  plot(ges.fit$essgraph, main = "Estimated CPDAG")
  plot(gmG8$g, main = "True DAG")
} else { ## alternative:
  str(ges.fit, max=2)
  as(as(ges.fit$essgraph,"graphNEL"),"Matrix")
}

Run the code above in your browser using DataLab