It is assumed that we have observational data that are multivariate
Gaussian and faithful to the true (but unknown) underlying causal DAG
(without hidden variables). Under these assumptions, this function
estimates the multiset of possible total causal effects of x on y,
where the total causal effect is defined via Pearl's
do-calculus as $E(Y|do(X=z+1))-E(Y|do(X=z))$ (this value does not
depend on $z$, since Gaussianity implies that conditional
expectations are linear). We estimate a set of possible total causal effects instead of
the unique total causal effect, since it is typically impossible to
identify the latter when the true underlying causal DAG is unknown
(even with an infinite amount of data). Conceptually, the method
works as follows. First, we estimate the equivalence class of DAGs
that describe the conditional independence relationships in the data,
using the function pc (see the help file of this function). For each
DAG G in the equivalence class, we apply Pearl's do-calculus to
estimate the total causal effect of x on y. This can be done via a
simple linear regression: if y is not a parent of x, we take the
regression coefficient of x in the regression lm(y ~ x + pa(x)), where
pa(x) denotes the parents of x in the DAG G; if y is a parent of x in
G, we set the estimated causal effect to zero.
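The regression rule above can be sketched in a few lines. The following is a minimal Python illustration using only the standard library, not the code used by this function: the helper names ols_coefficients and effect_in_dag, the dictionary-of-columns data layout and the simulated coefficients are all assumptions made for the example.

```python
import random

def ols_coefficients(columns, response):
    """Least-squares coefficients (intercept first) via the normal equations."""
    n = len(response)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    # Build the augmented system [X'X | X'y].
    aug = [[sum(design[i][a] * design[i][b] for i in range(n)) for b in range(p)]
           + [sum(design[i][a] * response[i] for i in range(n))]
           for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(p):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    return [aug[a][p] / aug[a][a] for a in range(p)]

def effect_in_dag(data, x, y, parents_of_x):
    """Estimated total causal effect of x on y in one DAG, given pa(x)."""
    if y in parents_of_x:
        return 0.0  # y is a parent of x in this DAG: the effect is set to zero
    coefs = ols_coefficients([data[x]] + [data[p] for p in parents_of_x], data[y])
    return coefs[1]  # coefficient of x (index 0 is the intercept)

# Simulated data (hypothetical): z -> x, z -> y and x -> y with effect 2.
random.seed(1)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + random.gauss(0, 1) for zi in z]
y = [2.0 * xi + 1.5 * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]
data = {"x": x, "y": y, "z": z}

effect_adjusted = effect_in_dag(data, "x", "y", ["z"])  # DAG with pa(x) = {z}
effect_unadjusted = effect_in_dag(data, "x", "y", [])   # DAG with pa(x) empty
effect_reverse = effect_in_dag(data, "x", "y", ["y"])   # DAG with y in pa(x)
```

Under these simulated assumptions, effect_adjusted should lie close to 2, while effect_unadjusted differs because a different DAG implies a different adjustment set. This is why different DAGs in the equivalence class can give different estimates.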
If the equivalence class contains k DAGs, this will yield k estimated
total causal effects. Since we do not know which DAG is the true
causal DAG, we do not know which estimated total causal effect of x on
y is the correct one. Therefore, we return the entire multiset of k
estimated effects (it is a multiset rather than a set because it can
contain duplicate values).
One can take summary measures of the multiset. For example, the
minimum absolute value provides a lower bound on the size of the true
causal effect: If the minimum absolute value of all values in the
multiset is larger than one, then we know that the size of the true
causal effect (up to sampling error) must be larger than one.
If method="global", the method as described above is carried out,
where all DAGs in the equivalence class of the estimated CPDAG
graphEst are computed using the function allDags. This method is
suitable for small graphs (say, up to 10 nodes).
If method="local", we do not determine all DAGs in the equivalence
class of the CPDAG. Instead, we only consider the local neighborhood
of x in the CPDAG. In particular, we consider all possible directions
of undirected edges that have x as endpoint, such that no new
v-structure is created. For each such configuration, we estimate the
total causal effect of x on y as above, using linear regression.
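The local enumeration can be illustrated with a short Python sketch. It is not the implementation used by this function: the name local_parent_sets and the edge representation (a set of ordered pairs for directed edges, a set of two-element frozensets for undirected edges) are hypothetical. The sketch rejects a configuration if it creates a new v-structure with x as collider; as a defensive assumption, it also rejects one that would create a new v-structure at a neighbor t of x after orienting x -> t.

```python
from itertools import combinations

def local_parent_sets(x, directed, undirected):
    """Yield each admissible parent set of x in a CPDAG.

    directed: set of (a, b) pairs meaning a -> b.
    undirected: set of frozenset({a, b}) meaning a - b.
    """
    def adjacent(a, b):
        return ((a, b) in directed or (b, a) in directed
                or frozenset((a, b)) in undirected)

    parents = sorted(a for (a, b) in directed if b == x)
    siblings = sorted(next(iter(e - {x})) for e in undirected if x in e)

    for size in range(len(siblings) + 1):
        for chosen in combinations(siblings, size):
            # Orient s -> x for s in chosen and x -> t for the other siblings.
            into_x = parents + list(chosen)
            # New collider at x: a newly oriented parent that is not
            # adjacent to some other parent of x.
            collider_at_x = any(
                not adjacent(a, b)
                for a, b in combinations(into_x, 2)
                if a in chosen or b in chosen)
            # New collider at a sibling t: x -> t <- a with a not adjacent to x.
            rest = [t for t in siblings if t not in chosen]
            collider_at_sibling = any(
                b == t and a != x and not adjacent(a, x)
                for t in rest for (a, b) in directed)
            if not collider_at_x and not collider_at_sibling:
                yield into_x

# Chain a - x - b: orienting both edges into x would create a -> x <- b.
chain = {frozenset(("a", "x")), frozenset(("x", "b"))}
chain_sets = list(local_parent_sets("x", set(), chain))

# Triangle a - x - b with a - b: every subset of {a, b} is admissible.
triangle = chain | {frozenset(("a", "b"))}
triangle_sets = list(local_parent_sets("x", set(), triangle))
```

Each yielded parent set would then play the role of pa(x) in the regression lm(y ~ x + pa(x)) described above, so the number of regressions depends only on the neighborhood of x and not on the size of the whole graph.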
At first sight, it is not clear that such a local configuration
corresponds to a DAG in the equivalence class of the CPDAG, since it
may be impossible to direct the remaining undirected edges without
creating a directed cycle or a v-structure. However, Maathuis,
Kalisch and Buehlmann (2009) showed that there is at least one DAG in
the equivalence class for each such local configuration. It follows that the multisets of total causal effects of
the "global" and the "local" method have the same unique values. They
may, however, have different multiplicities.
For example, a CPDAG may represent eight DAGs, and the global method
may produce the multiset {1.3, -0.5, 0.7, 1.3, 1.3, -0.5, 0.7, 0.7}.
The unique values in this multiset are -0.5, 0.7 and 1.3, and the
multiplicities are 2, 3 and 3. The local method, on the other hand,
may yield {1.3, -0.5, -0.5, 0.7}. The unique values are again -0.5,
0.7 and 1.3, but the multiplicities are now 2, 1 and 1. The fact that
the unique values of the multisets of the "global" and "local" method
are identical implies that summary measures of the multiset that only
depend on the unique values (such as the minimum absolute value) can
be estimated by the local method.
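The example multisets above can be checked with a small Python sketch (standard library only). The variable names are illustrative, and the numbers are the ones from the example, not output of this function.

```python
from collections import Counter

# Multisets from the example above.
global_effects = [1.3, -0.5, 0.7, 1.3, 1.3, -0.5, 0.7, 0.7]
local_effects = [1.3, -0.5, -0.5, 0.7]

# Multiplicity of each unique value.
global_counts = Counter(global_effects)
local_counts = Counter(local_effects)

# The unique values agree even though the multiplicities differ.
same_unique_values = set(global_counts) == set(local_counts)

# A summary that depends only on the unique values gives the same result.
min_abs_global = min(abs(v) for v in global_effects)
min_abs_local = min(abs(v) for v in local_effects)
```

Both minimum absolute values equal 0.5 here. A summary that depends on multiplicities, such as the mean or median of the multiset, would in general differ between the two methods.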