udag2pag: Last steps of FCI algorithm: Transform Final Skeleton into FCI-PAG

Description

This function perform the last steps of the FCI algorithm, as it transforms an un-oriented final skeleton into a Partial Ancestral Graph (PAG). The final skeleton must have been estimated with pdsep(). The result is an adjacency matrix indicating also the edge marks.

Usage

udag2pag(pag, sepset, rules = rep(TRUE, 10), unfVect = NULL,
  verbose = FALSE, orientCollider = TRUE)

Arguments

pag

Adjacency matrix of type amat.pag

sepset

List of length p; each element of the list contains another list of length p. The element sepset[[x]][[y]] contains the separation set that made the edge between x and y drop out. Each separation set is a vector with (integer) positions of variables in the adjacency matrix. This object is thought to be obtained from a pcAlgo-object.

rules

Array of length 10 containing TRUE or FALSE for each rule. TRUE in position i means that rule i (Ri) will be applied. By default, all rules are used.

unfVect

Vector containing numbers that encode ambiguous unshielded triples (as returned by pc.cons.intern). This is needed in the conservative and majority rule versions of FCI.

verbose

If TRUE, detailed output is provided.

orientCollider

if TRUE, collider are oriented.

Value

Adjacency matrix of type amat.pag.

Details

The skeleton is transformed into an FCI-PAG using rules by Zhang (2008).

If unfVect = NULL (i.e., one uses standard FCI or one uses conservative/majority rule FCI but there are no ambiguous triples), then the orientation rules are applied to each eligible structure until no more edges can be oriented. On the other hand, if one uses conservative or majority rule FCI and ambiguous triples have been found in pc.cons.intern, unfVect contains the numbers of all ambiguous triples in the graph. In this case, the orientation rules take this information into account. For example, if a *-> b o-* c and <a,b,c> is an unambigous unshielded triple and not a v-structure, then we obtain b -* c (otherwise we would create an additional v-structure). On the other hand, if a *-> b o-* c but <a,b,c> is an ambiguous unshielded triple, then the circle mark at b is not oriented.

Note that the algorithm works with columns' position of the adjacency matrix and not with the names of the variables.

Note that this function does not resolve possible order-dependence in the application of the orientation rules, see Colombo and Maathuis (2014).

References

D. Colombo and M.H. Maathuis (2014).Order-independent constraint-based causal structure learning. Journal of Machine Learning Research 15 3741-3782.

D. Colombo, M. H. Maathuis, M. Kalisch, T. S. Richardson (2012). Learning high-dimensional directed acyclic graphs with latent and selection variables. Ann. Statist. 40, 294--321.

J. Zhang (2008). On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias. Artificial Intelligence 172, 1873--1896.

Examples

Run this code

# NOT RUN {
##################################################
## Example with hidden variables
## Zhang (2008), Fig. 6, p.1882
##################################################

## draw a DAG with latent variables
## this example is taken from Zhang (2008), Fig. 6, p.1882 (see references)
amat <- t(matrix(c(0,1,0,0,1, 0,0,1,0,0, 0,0,0,1,0, 0,0,0,0,0, 0,0,0,1,0),5,5))
V <- as.character(1:5)
colnames(amat) <- rownames(amat) <- V
edL <- vector("list",length=5)
names(edL) <- V
edL[[1]] <- list(edges= c(2,4),weights=c(1,1))
edL[[2]] <- list(edges= 3,     weights=c(1))
edL[[3]] <- list(edges= 5,     weights=c(1))
edL[[4]] <- list(edges= 5,     weights=c(1))
g <- new("graphNEL", nodes=V, edgeL=edL,edgemode="directed")

if(require("Rgraphviz"))  plot(g) else print(g)

## define the latent variable
L <- 1

## compute the true covariance matrix of g
cov.mat <- trueCov(g)

## delete rows and columns which belong to L
true.cov <- cov.mat[-L,-L]

## transform it in a correlation matrix
true.corr <- cov2cor(true.cov)

if (require("MASS")) {
  ## generate 100000 samples of DAG using standard normal error distribution
  n <- 100000
  alpha <- 0.01
  set.seed(314)
  d.mat <- mvrnorm(n, mu = rep(0,dim(true.corr)[1]), Sigma = true.cov)

  ## estimate the skeleton of given data
  suffStat <- list(C = cor(d.mat), n = n)
  indepTest <- gaussCItest
  resD <- skeleton(suffStat, indepTest, p=dim(true.corr)[2], alpha = alpha)

  ## estimate v-structures conservatively
  tmp <- pc.cons.intern(resD, suffStat, indepTest, alpha, version.unf = c(1, 1))
  ## tripleList <- tmp$unfTripl
  resD <- tmp$sk

  ## estimate the final skeleton of given data using Possible-D-Sep
  pdsepRes <- pdsep(resD@graph, suffStat, indepTest, p=dim(true.corr)[2],
		    resD@sepset, alpha = alpha, m.max = Inf,
		    pMax = resD@pMax)

  ## extend the skeleton into a PAG using all 10 rules
  resP <- udag2pag(pag = pdsepRes$G, pdsepRes$sepset, rules = rep(TRUE,10),
		   verbose = TRUE)
  colnames(resP) <- rownames(resP) <- as.character(2:5)
  print(resP)

} # only if "MASS" is there

# }

Run the code above in your browser using DataLab