Fit the parameters of a Bayesian network conditional on its structure.
bn.fit(x, data, cluster = NULL, method = "mle", ..., keep.fitted = TRUE,
       debug = FALSE)
custom.fit(x, dist, ordinal, debug = FALSE)
bn.net(x, debug = FALSE)
x: an object of class bn (for bn.fit() and custom.fit()) or an object of class bn.fit (for bn.net()).
data: a data frame containing the variables in the model.
cluster: an optional cluster object from package parallel.
dist: a named list, with one element for each node of x. See below.
method: a character string, either mle for maximum likelihood parameter estimation or bayes for Bayesian parameter estimation (currently implemented only for discrete data).
...: additional arguments for the parameter estimation procedure, see below.
ordinal: a vector of character strings, the labels of the discrete nodes which should be saved as ordinal random variables (bn.fit.onode) instead of unordered factors (bn.fit.dnode).
keep.fitted: a boolean value. If TRUE, the object returned by bn.fit() will contain fitted values and residuals for all Gaussian and conditional Gaussian nodes, and the configurations of the discrete parents for conditional Gaussian nodes.
debug: a boolean value. If TRUE, a lot of debugging output is printed; otherwise the function is completely silent.
bn.fit() and custom.fit() return an object of class bn.fit; bn.net() returns an object of class bn. See bn class and bn.fit class for details.
bn.fit() fits the parameters of a Bayesian network given its structure and a data set; bn.net() returns the structure underlying a fitted Bayesian network.
Additional arguments for the bn.fit() function:

iss: a numeric value, the imaginary sample size used by the bayes method to estimate the conditional probability tables associated with discrete nodes (see score for details).
replace.unidentifiable: a boolean value. If TRUE and method is mle, unidentifiable parameters are replaced by zeroes (in the case of regression coefficients and standard errors in Gaussian and conditional Gaussian nodes) or by uniform conditional probabilities (in discrete nodes). If FALSE (the default), the conditional probabilities in the local distributions of discrete nodes have a maximum likelihood estimate of NaN for all parent configurations that are not observed in data. Similarly, regression coefficients are set to NA if the linear regressions corresponding to the local distributions of continuous nodes are singular. Such missing values propagate to the results of functions such as predict().
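For instance, a minimal sketch of both estimation methods, assuming the learning.test data set and the network structure learned in the examples below:

library(bnlearn)
data(learning.test)
# learn the structure and orient the only undirected arc, A - B.
dag = set.arc(gs(learning.test), from = "A", to = "B")
# Bayesian estimation of the conditional probability tables with an
# imaginary sample size of 10.
fitted.bayes = bn.fit(dag, learning.test, method = "bayes", iss = 10)
# maximum likelihood estimation, replacing unidentifiable parameters with
# uniform conditional probabilities instead of NaN.
fitted.mle = bn.fit(dag, learning.test, method = "mle",
                    replace.unidentifiable = TRUE)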
An in-place replacement method is available to change the parameters of each node in a bn.fit object; see the examples for discrete, continuous and hybrid networks below. For a discrete node (class bn.fit.dnode or bn.fit.onode), the new parameters must be in a table object. For a Gaussian node (class bn.fit.gnode), the new parameters can be defined either by an lm, glm or penfit object (the latter is from the penalized package) or in a list with elements named coef, sd and optionally fitted and resid. For a conditional Gaussian node (class bn.fit.cgnode), the new parameters can be defined by a list with elements named coef, sd and optionally fitted, resid and configs. In both cases coef should contain the new regression coefficients, sd the standard deviation of the residuals, fitted the fitted values and resid the residuals. configs should contain the configurations of the discrete parents of the conditional Gaussian node, stored as a factor.
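As an illustration, a hedged sketch of the in-place replacement for a conditional Gaussian node, assuming cgfit is a fitted hybrid network in which node C has one continuous parent (B) and one two-level discrete parent (as in the last example below):

# one column of regression coefficients and one standard deviation for each
# configuration of the discrete parent of C.
cgfit$C = list(coef = matrix(c(1.0, 2.0, 3.0, 4.0), ncol = 2,
                             dimnames = list(c("(Intercept)", "B"), NULL)),
               sd = c(0.5, 0.7))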
custom.fit() takes a set of user-specified distributions and their parameters and uses them to build a bn.fit object. Its purpose is to specify a Bayesian network (complete with the parameters, not only the structure) using knowledge from experts in the field instead of learning it from a data set. The distributions must be passed to the function in a list, with elements named after the nodes of the network structure x. Each element of the list must be in one of the formats described above for in-place replacement.
data(learning.test)
# learn the network structure.
res = gs(learning.test)
# set the direction of the only undirected arc, A - B.
res = set.arc(res, "A", "B")
# estimate the parameters of the Bayesian network.
fitted = bn.fit(res, learning.test)
# replace the parameters of the node B.
new.cpt = matrix(c(0.1, 0.2, 0.3, 0.2, 0.5, 0.6, 0.7, 0.3, 0.1),
byrow = TRUE, ncol = 3,
dimnames = list(B = c("a", "b", "c"), A = c("a", "b", "c")))
fitted$B = as.table(new.cpt)
# the network structure is still the same.
all.equal(res, bn.net(fitted))
data(gaussian.test)
# learn the network structure.
res = hc(gaussian.test)
# estimate the parameters of the Bayesian network.
fitted = bn.fit(res, gaussian.test)
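# inspect the fitted values and residuals stored by keep.fitted = TRUE (the
# default); a hedged sketch assuming node F is a Gaussian node, as above.
head(fitted$F$fitted.values)
head(fitted$F$residuals)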
# replace the parameters of the node F.
fitted$F = list(coef = c(1, 2, 3, 4, 5), sd = 3)
# set again the original parameters
fitted$F = lm(F ~ A + D + E + G, data = gaussian.test)
# discrete Bayesian network from expert knowledge.
net = model2network("[A][B][C|A:B]")
cptA = matrix(c(0.4, 0.6), ncol = 2, dimnames = list(NULL, c("LOW", "HIGH")))
cptB = matrix(c(0.8, 0.2), ncol = 2, dimnames = list(NULL, c("GOOD", "BAD")))
cptC = c(0.5, 0.5, 0.4, 0.6, 0.3, 0.7, 0.2, 0.8)
dim(cptC) = c(2, 2, 2)
dimnames(cptC) = list("C" = c("TRUE", "FALSE"), "A" = c("LOW", "HIGH"),
"B" = c("GOOD", "BAD"))
cfit = custom.fit(net, dist = list(A = cptA, B = cptB, C = cptC))
# for ordinal nodes it is nearly the same.
cfit = custom.fit(net, dist = list(A = cptA, B = cptB, C = cptC),
ordinal = c("A", "B"))
# Gaussian Bayesian network from expert knowledge.
distA = list(coef = c("(Intercept)" = 2), sd = 1)
distB = list(coef = c("(Intercept)" = 1), sd = 1.5)
distC = list(coef = c("(Intercept)" = 0.5, "A" = 0.75, "B" = 1.32), sd = 0.4)
cfit = custom.fit(net, dist = list(A = distA, B = distB, C = distC))
# conditional Gaussian Bayesian network from expert knowledge.
cptA = matrix(c(0.4, 0.6), ncol = 2, dimnames = list(NULL, c("LOW", "HIGH")))
distB = list(coef = c("(Intercept)" = 1), sd = 1.5)
distC = list(coef = matrix(c(1.2, 2.3, 3.4, 4.5), ncol = 2,
dimnames = list(c("(Intercept)", "B"), NULL)),
sd = c(0.3, 0.6))
cgfit = custom.fit(net, dist = list(A = cptA, B = distB, C = distC))