ida: Estimate Multiset of Possible Joint Total Causal Effects

Description

ida() estimates the multiset of possible joint total causal effects of variables (X) onto variables (Y) from observational data via adjustment.

Usage

ida(x.pos, y.pos, mcov, graphEst, method = c("local","optimal","global"),
    y.notparent = FALSE, verbose = FALSE, all.dags = NA, type = c("cpdag", "pdag"))

Arguments

x.pos, x

Positions of variables X in the covariance matrix.

y.pos, y

Positions of variables Y in the covariance matrix.

mcov

Covariance matrix that was used to estimate graphEst.

graphEst

Estimated CPDAG or PDAG. The CPDAG is typically from pc(): If the result of pc is pc.fit, the estimated CPDAG can be obtained by pc.fit@graph. A PDAG can be obtained from the CPDAG by adding background knowledge using addBgKnowledge().

method

Character string specifying the method with default "local".

"global":: The algorithm considers all undirected edges in the CPDAG or PDAG, using the possible parents as adjustment sets to estimate all possible causal effects. It is hence slow and can only be applied to singleton X.
"local":: The algorithm only considers edges in the neighborhood of X in the CPDAG or PDAG, using the possible parents as adjustment sets to estimate the unique possible causal effects. It is hence much faster than "global" and can only be applied to singleton X.
"optimal":: The algorithm considers only those edges necessary to compute the possible optimal valid adjustment sets, using these as adjustment sets to estimate the unique possible causal effects. It is hence faster than the "global" option but also slower than "local". It provides more efficient estimates than both alternatives but is also more sensitive to faulty graph estimates. Can be applied to sets X.

See details below.

y.notparent

Logical; for singleton X and Y. If true, any edge between X and Y is assumed to be of the form X->Y. Not implemented for the method="optimal"

verbose

If TRUE, details on the regressions are printed.

all.dags

All DAGs in the equivalence class represented by the CPDAG or PDAG can be precomputed by pdag2allDags() and passed via this argument. In that case, pdag2allDags(..) is not called internally. This option is only relevant when using method="global".

type

Type of graph "graphEst"; can be of type "cpdag" or "pdag" (e.g. a CPDAG with background knowledge from Meek, 1995)

Value

A list of length |Y| of matrices, each containing the possible joint total causal effect of X on one node in Y.

Details

It is assumed that we have observational data from a multivariate Gaussian distribution faithful to the true (but unknown) underlying causal DAG (without hidden variables). Under these assumptions, this function estimates the multiset of possible total joint effects of X on Y. Here the total joint effect of X \(= (X_1,X_2)\) on Y is defined via Pearl's do-calculus as the vector \((E[Y|do(X_1=x_1+1,X_2=x_2)]-E[Y|do(X_1=x_1,X_2=x_2)], E[Y|do(X_1=x_1,X_2=x_2+1)]-E[Y|do(X_1=x_1,X_2=x_2)])\), with a similar definition for more than two variables. These values are equal to the partial derivatives (evaluated at \(x_1,x_2\)) of \(E[Y|do(X=x_1',X_2=x_2')]\) with respect to \(x_1\)' and \(x_2\)'. Moreover, under the Gaussian assumption, these partial derivatives do not depend on the values at which they are evaluated.

We estimate a set of possible joint total causal effects instead of the unique joint total causal effect, since it is typically impossible to identify the latter when the true underlying causal DAG is unknown (even with an infinite amount of data). Conceptually, the method works as follows. First, we estimate the equivalence class of DAGs that describe the conditional independence relationships in the data, using the function pc (see the help file of this function). For each DAG G in the equivalence class, we apply Pearl's do-calculus to estimate the total causal effect of X on Y. This can be done via a simple linear regression adjusting for a valid adjustment set.

For example, if X and Y are singleton and Y is not a parent of X, we can take the regression coefficient of X in the regression lm(Y ~ X + pa(X,G)), where pa(X,G) denotes the parents of X in the DAG G; if Y is a parent of X in G, we can set the estimated causal effect to zero.

If the equivalence class contains k DAGs, this will yield k estimated total causal effects. Since we do not know which DAG is the true causal DAG, we do not know which estimated possible total joint causal effect of X on Y is the correct one. Therefore, we return the entire multiset of k estimated effects (it is a multiset rather than a set because it can contain duplicate values).

One can take summary measures of the multiset. For example, the minimum absolute value provides a lower bound on the size of the true causal effect: If the minimum absolute value of all values in the multiset is larger than one, then we know that the size of the true causal effect (up to sampling error) must be larger than one.

If method="global", the method as described above is carried out, where all DAGs in the equivalene class of the estimated CPDAG or PDAG graphEst are computed using the function pdag2allDags. The parent set for each DAG is then used to estimate the corresponding possible total causal effect. This method is suitable for small graphs (say, up to 10 nodes) and can only be used for singleton X.

If method="local", we only consider all valid possible directions of undirected edges that have X as an endpoint.

In the case of a CPDAG, we consider all possible directions of undirected edges that have X as an endpoint, such that no new v-structure is created. Maathuis, Kalisch and Buehlmann (2009) showed that there is at least one DAG in the equivalence class for each such local configuration. Hence, the procedure is truly local in this setting.

In the case of a PDAG, we need to verify for all possible directions whether they lead to an amenable max. PDAG if we apply Meek's orientation rules. In this setting the complexity of the "local" method is similar to the "optimal" one and it is not truly local. For details see Section 4.2 in Perkovic, Kalisch and Maathuis (2017).

We estimate the total causal effect of X on Y for each valid configuration as above, using linear regression adjusting for the correspoding possible parents. As we adjust for the same sets as in the "global" method, it follows that the multisets of total causal effects of the two methods have the same unique values. They may, however, have different multiplicities.

Since the parents of X are usually an inefficient valid adjustment set we provide a third method, that uses different adjustment sets.

If method="optimal", we do not determine all DAGs in the equivalence class of the CPDAG or PDAG. Instead, we only direct edges until obtaining an amenable PDAG, which is sufficient for computing the optimal valid adjustment set. Each amenable PDAG can be obtained by orienting the neighborhood of X and then applying Meek's orientation rules, similar to the "local" method for PDAGs. This can be done faster than the "global" method but is slower than the "local" method, especially for CPDAGs. For details see Witte, Henckel, Maathuis and Didelez (2019).

For each amenable PDAG the corresponding optimal valid adjustment set is computed. The optimal set is a valid adjustment set irrespectively of whether X is a singleton. Hence, as opposed to the other two, this method can be applied to sets X. Sometimes, however, a joint total causal effect cannot be estimated via adjustment. In these cases we recommend use of the pcalg function jointIda.

We then estimate the joint total causal effect of X on Y for each valid configuration with linear regression, adjusting for the possible optimal sets. If the estimated graph is correct, each of these regressions is guaranteed to be more efficient than the corresponding linear regression with any other valid adjustment set (see Henckel, Perkovic and Maathuis (2019) for more details). The estimates are, however, more sensitive to graph estimation errors than the ones obtained with the other two methods. If X is a singleton, the output of this method is a multiset of the same size as the output of the "local" method.

For example, a CPDAG may represent eight DAGs, and the "global" method may produce an estimate of the multiset of possible total effects {1.3, -0.5, 0.7, 1.3, 1.3, -0.5, 0.7, 0.7}. The unique values in this set are -0.5, 0.7 and 1.3, and the multiplicities are 2, 3 and 3. The "local" and "optimal" methods, on the other hand, may prodcue estimates of the set {1.3, -0.5, -0.5, 0.7}. The unique values are again -0.5, 0.7 and 1.3, but the multiplicities are now 2, 1 and 1. The fact that the unique values of the multisets for all three methods are identical implies that summary measures of the multiset that only depend on the unique values (such as the minimum absolute value) can be estimated with all three.

References

M.H. Maathuis, M. Kalisch, P. Buehlmann (2009). Estimating high-dimensional intervention effects from observational data. Annals of Statistics 37, 3133--3164.

M.H. Maathuis, D. Colombo, M. Kalisch, P. B<U+00FC>hlmann (2010). Predicting causal effects in large-scale systems from observational data. Nature Methods 7, 247--248.

C. Meek (1995). Causal inference and causal explanation with background knowledge, In Proceedings of UAI 1995, 403-410.

Markus Kalisch, Martin Maechler, Diego Colombo, Marloes H. Maathuis, Peter Buehlmann (2012). Causal inference using graphical models with the R-package pcalg. Journal of Statistical Software 47(11) 1--26, https://www.jstatsoft.org/v47/i11/.

Pearl (2005). Causality. Models, reasoning and inference. Cambridge University Press, New York.

E. Perkovic, M. Kalisch and M.H. Maathuis (2017). Interpreting and using CPDAGs with background knowledge. In Proceedings of UAI 2017.

L. Henckel, E. Perkovic and M.H. Maathuis (2019). Graphical criteria for efficient total effect estimation via adjustment in causal linear models. Working Paper.

J. Witte, L. Henckel, M.H Maathuis and V. Didelez (2019). On efficient adjustment in causal graphs. Working Paper.

Examples

Run this code

# NOT RUN {
## Simulate the true DAG
suppressWarnings(RNGversion("3.5.0"))
set.seed(123)
p <- 10
myDAG <- randomDAG(p, prob = 0.2) ## true DAG
myCPDAG <- dag2cpdag(myDAG) ## true CPDAG
myPDAG <- addBgKnowledge(myCPDAG,2,3) ## true PDAG with background knowledge 2 -> 3
covTrue <- trueCov(myDAG) ## true covariance matrix

## simulate Gaussian data from the true DAG
n <- 10000
dat <- rmvDAG(n, myDAG)

## estimate CPDAG and PDAG -- see  help(pc)
suffStat <- list(C = cor(dat), n = n)
pc.fit <- pc(suffStat, indepTest = gaussCItest, p=p, alpha = 0.01)
pc.fit.pdag <- addBgKnowledge(pc.fit@graph,2,3)

if (require(Rgraphviz)) {
  ## plot the true and estimated graphs
  par(mfrow = c(1,3))
  plot(myDAG, main = "True DAG")
  plot(pc.fit, main = "Estimated CPDAG")
  plot(pc.fit.pdag, main = "Max. PDAG")
}

## Supppose that we know the true CPDAG and covariance matrix
(l.ida.cpdag <- ida(3,10, covTrue, myCPDAG, method = "local", type = "cpdag"))
(o.ida.cpdag <- ida(3,10, covTrue, myCPDAG, method = "optimal", type = "cpdag"))
# }
# NOT RUN {
(g.ida.cpdag <- ida(3,10, covTrue, myCPDAG, method = "global", type = "cpdag"))
# }
# NOT RUN {
## All three methods produce the same unique values. 

## Supppose that we know the true PDAG and covariance matrix
(l.ida.pdag <- ida(3,10, covTrue, myPDAG, method = "local", type = "pdag"))
(o.ida.pdag <- ida(3,10, covTrue, myPDAG, method = "optimal", type = "pdag"))
# }
# NOT RUN {
(g.ida.pdag <- ida(3,10, covTrue, myPDAG, method = "global", type = "pdag"))
# }
# NOT RUN {
## All three methods produce the same unique values.

## From the true DAG, we can compute the true causal effect of 3 on 10
(ce.3.10 <- causalEffect(myDAG, 10, 3))
## Indeed, this value is contained in the values found by all methods

## When working with data we have to use the estimated CPDAG and
## the sample covariance matrix
(l.ida.est.cpdag <- ida(3,10, cov(dat), pc.fit@graph, method = "local", type = "cpdag"))
(o.ida.est.cpdag <- ida(3,10, cov(dat), pc.fit@graph, method = "optimal", type = "cpdag"))
# }
# NOT RUN {
(g.ida.est.cpdag <- ida(3,10, cov(dat), pc.fit@graph,
method = "global", type = "cpdag"))
# }
# NOT RUN {
## The unique values of the local and the global method are still identical.
## While not identical, the values of the optimal method are very similar.
## The true causal effect is contained in all three sets, up to a small
## estimation error (0.118 vs. 0.112 with true value 0.114) 

## Similarly, when working with data and background knowledge we have to use the estimated PDAG and
## the sample covariance matrix
(l.ida.est.pdag <- ida(3,10, cov(dat), pc.fit.pdag, method = "local", type = "pdag"))
(o.ida.est.pdag <- ida(3,10, cov(dat), pc.fit.pdag, method = "optimal", type = "pdag"))
# }
# NOT RUN {
(g.ida.est.pdag <- ida(3,10, cov(dat), pc.fit.pdag, method = "global", type = "pdag"))
# }
# NOT RUN {
## The unique values of the local and the global method are still identical.
## While not necessarily identical, the values of the optimal method will be similar.

## The true causal effect is contained in both sets, up to a small estimation error

## All three can also be applied to sets y.
(l.ida.cpdag.2 <- ida(3,c(6,10), cov(dat), pc.fit@graph, method = "local", type = "cpdag"))
(o.ida.cpdag.2 <- ida(3,c(6,10), cov(dat), pc.fit@graph, method = "optimal", type = "cpdag"))
# }
# NOT RUN {
(g.ida.cpdag.2 <- ida(3,c(6,10), cov(dat), pc.fit@graph,
method = "global", type = "cpdag"))
# }
# NOT RUN {
## For the methods local and global we recommend use of idaFast in this case for better performance.

## Note that only the optimal method can be appplied to sets x.
(o.ida.cpdag.2 <- ida(c(2,3),10, cov(dat), pc.fit@graph, method = "optimal", type = "cpdag"))

# }

Run the code above in your browser using DataLab