Available constraint-based learning algorithms:

- Grow-Shrink (gs): based on the Grow-Shrink Markov Blanket, the first (and simplest) Markov blanket detection algorithm used in a structure learning algorithm.
- Incremental Association (iamb): based on the Markov blanket detection algorithm of the same name, which is based on a two-phase selection scheme (a forward selection followed by an attempt to remove false positives).
- Fast Incremental Association (fast.iamb): a variant of IAMB which uses speculative stepwise forward selection to reduce the number of conditional independence tests.
- Interleaved Incremental Association (inter.iamb): another variant of IAMB which uses forward stepwise selection to avoid false positives in the Markov blanket detection phase.

This package includes three implementations of each algorithm:

- an optimized implementation (used when the optimized parameter is set to TRUE), which uses backtracking to initialize the learning process of each node.
- an unoptimized implementation (used when the optimized parameter is set to FALSE), which is better at uncovering possible erratic behaviour of the statistical tests.
- a cluster-aware implementation, which requires a running cluster set up with the makeCluster function from the parallel package.

The computational complexity of these algorithms is polynomial in the number of tests, usually $O(N^2)$ ($O(N^4)$ in the worst case scenario), where $N$ is the number of variables. Execution time scales linearly with the size of the data set.
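For example, the cluster-aware implementation can be used as in the following sketch, which assumes a local machine with two spare cores and relies on the cluster argument of the learning functions (the call pattern is illustrative, not exhaustive):

library(bnlearn)
library(parallel)
data(learning.test)
# set up a two-worker cluster; any cluster created with makeCluster works.
cl = makeCluster(2)
# the conditional independence tests are distributed over the cluster.
res = gs(learning.test, cluster = cl)
# always release the workers when done.
stopCluster(cl)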
Available score-based learning algorithms:

- Hill-Climbing (hc): a hill climbing greedy search on the space of the directed graphs. The optimized implementation uses score caching, score decomposability and score equivalence to reduce the number of duplicated tests.
- Tabu Search (tabu): a modified hill climbing able to escape local optima by selecting a network that minimally decreases the score function.

Random restart with a configurable number of perturbing operations is implemented for both algorithms.
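As a rough sketch (the restart, perturb and tabu arguments are used here as I understand them from the package interface; the values are arbitrary):

library(bnlearn)
data(learning.test)
# hill climbing with 5 random restarts, each applying 10 random
# perturbations to the current network before restarting the search.
res = hc(learning.test, score = "bic", restart = 5, perturb = 10)
# tabu search keeps the last 15 visited structures on the tabu list,
# which lets it escape local optima.
res2 = tabu(learning.test, score = "bic", tabu = 15)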
Available hybrid learning algorithms:

- Max-Min Hill-Climbing (mmhc): a hybrid algorithm which combines the Max-Min Parents and Children algorithm (to restrict the search space) and the Hill-Climbing algorithm (to find the optimal network structure in the restricted space).
- General 2-Phase Restricted Maximization (rsmax2): a more general implementation of the Max-Min Hill-Climbing, which can use any combination of constraint-based and score-based algorithms.
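A short, hedged sketch of the relationship between the two (assuming mmhc is indeed a shorthand for the corresponding rsmax2 call):

library(bnlearn)
data(learning.test)
# Max-Min Hill-Climbing as a single call.
res = mmhc(learning.test)
# the same hybrid scheme spelled out explicitly: MMPC restricts the
# search space, hill climbing maximizes the score within it. Any other
# combination of constraint-based and score-based algorithms can be
# plugged into restrict and maximize.
res2 = rsmax2(learning.test, restrict = "mmpc", maximize = "hc")
all.equal(res, res2)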
Other (constraint-based) local discovery algorithms:

- Max-Min Parents and Children (mmpc): a forward selection technique for neighbourhood detection based on the maximization of the minimum association measure observed with any subset of the nodes selected in the previous iterations.
- Semi-Interleaved HITON-PC (si.hiton.pc): a fast forward selection technique for neighbourhood detection designed to exclude nodes early based on the marginal association. The implementation follows the Semi-Interleaved variant of the algorithm.
- Chow-Liu (chow.liu): an application of the minimum-weight spanning tree and the information inequality. It learns the tree structure closest to the true one in the probability space.
- ARACNE (aracne): an improved version of the Chow-Liu algorithm that is able to learn polytrees.

All these algorithms have three implementations (unoptimized, optimized and cluster-aware) like the other constraint-based algorithms.
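A brief sketch of how these local discovery algorithms are typically called (the choice of data set and algorithms is purely illustrative):

library(bnlearn)
data(learning.test)
# mmpc() and si.hiton.pc() learn the skeleton (an undirected graph).
skel = mmpc(learning.test)
plot(skel)
# chow.liu() learns an undirected tree approximating the dependence
# structure; aracne() works similarly but can recover polytrees.
tree = chow.liu(learning.test)
plot(tree)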
Bayesian Network classifiers:

- naive Bayes (naive.bayes): a very simple algorithm assuming that all classifiers are independent and using the posterior probability of the target variable for classification.
- Tree-Augmented naive Bayes (tree.bayes): an improvement over naive Bayes, this algorithm uses Chow-Liu to approximate the dependence structure of the classifiers.
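A minimal sketch of the classifier interface, using variable A of learning.test as the (arbitrarily chosen) class:

library(bnlearn)
data(learning.test)
# naive Bayes: A is the training (class) variable, all other variables
# are treated as independent predictors.
bn = naive.bayes(learning.test, training = "A")
plot(bn)
# Tree-Augmented naive Bayes: the predictors are additionally connected
# by a Chow-Liu tree.
tan = tree.bayes(learning.test, training = "A")
plot(tan)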
Available (conditional) independence tests:

- discrete case (multinomial distribution):
  - mutual information: the asymptotic $\chi^2$ tests (mi and mi-adf, with adjusted degrees of freedom), the Monte Carlo permutation test (mc-mi), the sequential Monte Carlo permutation test (smc-mi), and the semiparametric test (sp-mi) are implemented.
  - shrinkage estimator for the mutual information (mi-sh): an improved asymptotic $\chi^2$ test based on the James-Stein estimator for the mutual information.
  - Pearson's $X^2$: the asymptotic $\chi^2$ tests (x2 and x2-adf, with adjusted degrees of freedom), the Monte Carlo permutation test (mc-x2), the sequential Monte Carlo permutation test (smc-x2) and the semiparametric test (sp-x2) are implemented.
- ordinal case:
  - Jonckheere-Terpstra: the asymptotic normal test (jt), the Monte Carlo permutation test (mc-jt) and the sequential Monte Carlo permutation test (smc-jt) are implemented.
- continuous case (multivariate normal distribution):
  - linear correlation: the exact Student's t test (cor), the Monte Carlo permutation test (mc-cor) and the sequential Monte Carlo permutation test (smc-cor) are implemented.
  - Fisher's Z: a transformation of the linear correlation with asymptotic normal distribution (an implementation is also present in the pcalg package on CRAN). The asymptotic normal test (zf), the Monte Carlo permutation test (mc-zf) and the sequential Monte Carlo permutation test (smc-zf) are implemented.
  - mutual information: the asymptotic $\chi^2$ test (mi-g), the Monte Carlo permutation test (mc-mi-g) and the sequential Monte Carlo permutation test (smc-mi-g) are implemented.
  - shrinkage estimator for the mutual information (mi-g-sh): an improved asymptotic $\chi^2$ test based on the James-Stein estimator for the mutual information.
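The same labels select both stand-alone tests (via ci.test) and the tests used inside the constraint-based algorithms; a small sketch, with variables and conditioning set chosen only for illustration:

library(bnlearn)
data(learning.test)
data(gaussian.test)
# a single conditional independence test: asymptotic and Monte Carlo
# permutation versions of the mutual information test.
ci.test("B", "F", "C", data = learning.test, test = "mi")
ci.test("B", "F", "C", data = learning.test, test = "mc-mi", B = 200)
# the same labels choose the test used by constraint-based algorithms.
res = gs(learning.test, test = "x2")
res.g = gs(gaussian.test, test = "cor")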
Available network scores:

- discrete case (multinomial distribution):
  - the multinomial log-likelihood (loglik) score, which is equivalent to the entropy measure used in Weka.
  - the Akaike Information Criterion score (aic).
  - the Bayesian Information Criterion score (bic), which is equivalent to the Minimum Description Length (MDL) and is also known as Schwarz Information Criterion.
  - the logarithm of the Bayesian Dirichlet equivalent score (bde), a score equivalent Dirichlet posterior density.
  - the logarithm of the modified Bayesian Dirichlet equivalent score (mbde) for mixtures of experimental and observational data (not score equivalent).
  - the logarithm of the K2 score (k2), a Dirichlet posterior density (not score equivalent).
- continuous case (multivariate normal distribution):
  - the multivariate Gaussian log-likelihood (loglik-g) score.
  - the corresponding Akaike Information Criterion score (aic-g).
  - the corresponding Bayesian Information Criterion score (bic-g).
  - a score equivalent Gaussian posterior density (bge).
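These labels can also be passed to the score() function to evaluate a given network structure, as in the following sketch (the iss value for bde, the imaginary sample size, is arbitrary):

library(bnlearn)
data(learning.test)
res = hc(learning.test)
# compare some of the scores listed above on the same structure.
score(res, learning.test, type = "loglik")
score(res, learning.test, type = "aic")
score(res, learning.test, type = "bic")
score(res, learning.test, type = "bde", iss = 10)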
Whitelist and blacklist support:

All learning algorithms support arc whitelisting and blacklisting: blacklisted arcs are never present in the learned network, while whitelisted arcs are always included. Any arc whitelisted and blacklisted at the same time is assumed to be whitelisted, and is thus removed from the blacklist. In algorithms that learn undirected graphs, such as ARACNE and Chow-Liu, an arc must be blacklisted in both directions to blacklist the underlying undirected arc.
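The precedence rule above can be illustrated with a small sketch (the arcs involved are arbitrary; E -> F is deliberately placed in both lists):

library(bnlearn)
data(learning.test)
# E -> F is both blacklisted and whitelisted: it is treated as
# whitelisted, dropped from the blacklist and forced into the network.
bl = data.frame(from = c("B", "F", "E"), to = c("F", "B", "F"))
wl = data.frame(from = "E", to = "F")
res = gs(learning.test, whitelist = wl, blacklist = bl)
plot(res)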
Error detection and correction (the strict mode):

The optimized implementations of the constraint-based algorithms rely heavily on backtracking to reduce the number of tests, which may occasionally hide errors in the Markov blanket or neighbourhood detection steps. On the other hand, in the unoptimized implementations of constraint-based algorithms the learning of the Markov blanket and neighbourhood of each node is completely independent from the rest of the learning process. Thus it may happen that the Markov blanket or the neighbourhoods are not symmetric (i.e. A is in the Markov blanket of B but not vice versa), or that some arc directions conflict with each other.
The strict parameter enables some measure of error correction for such inconsistencies, which may help to retrieve a good model when the learning process would otherwise fail:

- if strict is set to TRUE, every error stops the learning process and results in an error message.
- if strict is set to FALSE, the algorithm attempts to fix such inconsistencies on the fly and the learning process continues.

Package: bnlearn
Type: Package
Version: 3.6
Date: 2014-06-17
License: GPLv2 or later
This package implements some algorithms for learning the structure of Bayesian networks.
Constraint-based algorithms, also known as conditional independence learners, are all optimized derivatives of the Inductive Causation algorithm (Verma and Pearl, 1991). These algorithms use conditional independence tests to detect the Markov blankets of the variables, which in turn are used to compute the structure of the Bayesian network.
Score-based learning algorithms are general purpose heuristic optimization algorithms which rank network structures with respect to a goodness-of-fit score.
Hybrid algorithms combine aspects of both constraint-based and score-based algorithms, as they use conditional independence tests (usually to reduce the search space) and network scores (to find the optimal network in the reduced space) at the same time.
Several functions for parameter estimation, parametric inference,
bootstrap, cross-validation and stochastic simulation are available.
Furthermore, advanced plotting capabilities are implemented on top of the Rgraphviz and lattice packages.
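As a brief, non-exhaustive sketch of these facilities (the event, evidence and sample sizes are chosen only for illustration):

library(bnlearn)
data(learning.test)
# structure learning followed by parameter estimation.
dag = hc(learning.test)
fitted = bn.fit(dag, learning.test)
# approximate inference: P(B = b | A = a).
cpquery(fitted, event = (B == "b"), evidence = (A == "a"))
# stochastic simulation: draw 100 new observations from the fitted network.
sim = rbn(fitted, n = 100)
# 10-fold cross-validation of the hill-climbing algorithm.
bn.cv(learning.test, bn = "hc")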
References:

Nagarajan R, Scutari M, Lebre S (2013). "Bayesian Networks in R with Applications in Systems Biology". Springer.
Scutari M (2010). "Learning Bayesian Networks with the bnlearn R Package". Journal of Statistical Software, 35(3), 1-22. URL http://www.jstatsoft.org/v35/i03/.
Koller D, Friedman N (2009). Probabilistic Graphical Models: Principles and Techniques. MIT Press.
Korb K, Nicholson AE (2010). Bayesian Artificial Intelligence. Chapman & Hall/CRC, 2nd edition.
Pearl J (1988). Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann.
Examples:

library(bnlearn)
data(learning.test)
## Simple learning
# first try the Grow-Shrink algorithm
res = gs(learning.test)
# plot the network structure.
plot(res)
# now try the Incremental Association algorithm.
res2 = iamb(learning.test)
# plot the new network structure.
plot(res2)
# the network structures seem to be identical, don't they?
all.equal(res, res2)
# how many tests did each of the two algorithms use?
ntests(res)
ntests(res2)
# and the unoptimized implementations of these algorithms?
ntests(gs(learning.test, optimized = FALSE))
ntests(iamb(learning.test, optimized = FALSE))
## Greedy search
res = hc(learning.test)
plot(res)
## Another simple example (Gaussian data)
data(gaussian.test)
# first try the Grow-Shrink algorithm
res = gs(gaussian.test)
plot(res)
## Blacklist and whitelist use
# the arc B - F should not be there?
blacklist = data.frame(from = c("B", "F"), to = c("F", "B"))
blacklist
res3 = gs(learning.test, blacklist = blacklist)
plot(res3)
# force E - F direction (E -> F).
whitelist = data.frame(from = c("E"), to = c("F"))
whitelist
res4 = gs(learning.test, whitelist = whitelist)
plot(res4)
# use both blacklist and whitelist.
res5 = gs(learning.test, whitelist = whitelist, blacklist = blacklist)
plot(res5)
## Debugging
# use the debugging mode to see the learning algorithms
# in action.
res = gs(learning.test, debug = TRUE)
res = hc(learning.test, debug = TRUE)
# log the learning process for future reference.
sink(file = "learning-log.txt")
res = gs(learning.test, debug = TRUE)
sink()
# if something seems wrong, try the unoptimized version
# in strict mode (inconsistencies trigger errors):
res = gs(learning.test, optimized = FALSE, strict = TRUE, debug = TRUE)
# or disable strict mode to let the algorithm fix errors on the fly:
res = gs(learning.test, optimized = FALSE, strict = FALSE, debug = TRUE)