
skeleton(suffStat, indepTest, p, alpha, verbose = FALSE, fixedGaps = NULL,
fixedEdges = NULL, NAdelete = TRUE, m.max = Inf)
indepTest
.indepTest(x,y,S,suffStat)
, and
tests conditional independence of x
and y
given
S
. Here, x<
TRUE
, detailed output is provided.[i,j]
or [j,i]
(or both) are TRUE, the edge i-j is
removed before starting the algorithm. Therefore, this edge is
guaranteed to be absent in the resulting graph.[i,j]
or [j,i]
(or both) are TRUE, the edge i-j is
never considered for removal. Therefore, this edge is
guaranteed to be present in the resulting graph.NA
and this option is
TRUE
, the corresponding edge is deleted. If this option is
FALSE
, the edge is not deleted.class
"pcAlgo"
(see
pcAlgo
) containing an estimate of the skeleton of
the underlying DAG,
the conditioning sets that led to edge removals (sepset) and several
other parameters. The data are not required to follow a specific distribution,
but one should make
sure that the conditional indepedence test used in indepTest
is appropriate for the data. Pre-programmed versions of indepTest
are available for Gaussian data (gaussCItest
), discrete data
(disCItest
), and binary data (see binCItest
).
Users can also specify their own indepTest
function.
The PC algorithm (Spirtes, Glymour and Scheines, 2000) starts with a
complete undirected graph. In each step, it visits
all pairs (i
,j
) of adjacent nodes in the current graph, and
determines based on conditional independence tests whether the edge i-j
should be removed. In particular, in step m
(m
=0,1,...)
the algorithm visits all
pairs (i
, j
) of adjacent nodes in the current
graph, and the edge between
i
and j
is kept if and only if the null hypothesis
"i
and j
are
conditionally independent given S" is
rejected at significance level alpha
for all subsets S
of size
m
of the neighbours of i
and of the neighbors of j
in the
current graph (as judged by the function indepTest
).
The algorithm stops when m is larger than the
largest neighbourhood size of all nodes, or when m has reached
the limit m.max
defined by the user.
The information in fixedGaps
and fixedEdges
is used as follows.
The gaps given in fixedGaps
are introduced in the very beginning of
the algorithm by removing the corresponding edges from the complete
undirected graph. Pairs (i
,j
) in fixedEdges
are
skipped in all steps of the algorithm, so that these edges remain in the
graph.
Note: Throughout, the algorithm works with the column positions of the variables in the adjacency matrix, and not with the names of the variables.
M. Kalisch and P. Buehlmann (2007) Estimating high-dimensional directed acyclic graphs with the PC-algorithm, JMLR 8 613-636.
pc
for generating a partially directed graph
using the PC algorithm; udag2pdag
for converting the
skeleton to a CPDAG; gaussCItest
,
disCItest
, binCItest
and
dsepTest
as examples for indepTest
.##################################################
## Using Gaussian Data
##################################################
## Load predefined data
data(gmG)
n <- nrow(gmG$x)
p <- ncol(gmG$x)
## define independence test (partial correlations)
indepTest <- gaussCItest
## define sufficient statistics
suffStat <- list(C = cor(gmG$x), n = n)
## estimate Skeleton
alpha <- 0.01
skeleton.fit <- skeleton(suffStat, indepTest, p, alpha)
if (require(Rgraphviz)) {
## show estimated Skeleton
par(mfrow=c(1,2))
plot(skeleton.fit, main = "Estimated Skeleton")
plot(gmG$g, main = "True DAG")
}
##################################################
## Using d-separation oracle
##################################################
## define independence test
indepTest <- dsepTest
## define sufficient statistics (d-separation oracle)
suffStat <- list(g = gmG$g, jp = RBGL::johnson.all.pairs.sp(gmG$g))
## estimate Skeleton
alpha <- 0.01 ## value is irrelevant as dsepTest returns either 0 or 1
fit <- skeleton(suffStat, indepTest, p, alpha)
if (require(Rgraphviz)) {
## show estimated Skeleton
plot(fit, main = "Estimated Skeleton")
plot(gmG$g, main = "True DAG")
}
##################################################
## Using discrete data
##################################################
## Load data
data(gmD)
p <- ncol(gmD$x)
## define independence test (G^2 statistics)
indepTest <- disCItest
## define sufficient statistics
suffStat <- list(dm = gmD$x, nlev = c(3,2,3,4,2), adaptDF = FALSE)
## estimate Skeleton
alpha <- 0.01
skeleton.fit <- skeleton(suffStat, indepTest, p, alpha, verbose = TRUE)
if (require(Rgraphviz)) {
## show estimated Skeleton
par(mfrow = c(1,2))
plot(skeleton.fit, main = "Estimated Skeleton")
plot(gmD$g, main = "True DAG")
}
##################################################
## Using binary data
##################################################
## Load binary data
data(gmB)
p <- ncol(gmB$x)
## define independence test
indepTest <- binCItest
## define sufficient statistics
suffStat <- list(dm = gmB$x, adaptDF = FALSE)
## estimate Skeleton
alpha <- 0.01
skeleton.fit <- skeleton(suffStat, indepTest, p, alpha, verbose = TRUE)
if (require(Rgraphviz)) {
## show estimated Skeleton
plot(skeleton.fit, main = "Estimated Skeleton")
plot(gmB$g, main = "True DAG")
}
Run the code above in your browser using DataLab