ebdbn: Empirical Bayes Dynamic Bayesian Network (EBDBN) Estimation

Description

A function to infer the posterior mean and variance of network parameters using an empirical Bayes estimation procedure for a Dynamic Bayesian Network (DBN).

Usage

ebdbn(input = "feedback", y, K, conv.1 = 0.15, conv.2 = 0.05, 
	conv.3 = 0.01, verbose = TRUE)

Arguments

input

"feedback" for feedback loop networks, or a list of R (MxT) matrices of input profiles

y

A list of R (PxT) matrices of observed time course profiles

K

Number of hidden states

conv.1

Value of convergence criterion 1

conv.2

Value of convergence criterion 2

conv.3

Value of convergence criterion 3

verbose

Verbose output

Value

APost          Posterior mean of matrix $A$
BPost          Posterior mean of matrix $B$
CPost          Posterior mean of matrix $C$
DPost          Posterior mean of matrix $D$
CvarPost       Posterior variance of matrix $C$
DvarPost       Posterior variance of matrix $D$
xPost          Posterior mean of hidden states $x$
alphaEst       Estimated value of $\alpha$
betaEst        Estimated value of $\beta$
gammaEst       Estimated value of $\gamma$
deltaEst       Estimated value of $\delta$
vEst           Estimated value of precisions $v$
muEst          Estimated value of $\mu$
sigmaEst       Estimated value of $\Sigma$
alliterations  Total number of iterations run

Details

This function infers the parameters of a network, based on the state space model

$$x_t = Ax_{t-1} + Bu_t + w_t$$
$$y_t = Cx_t + Du_t + z_t$$

where $x_t$ represents the expression of K hidden states at time $t$, $y_t$ represents the expression of P observed states (e.g., genes) at time $t$, $u_t$ represents the values of M inputs at time $t$, $w_t \sim MVN(0,I)$, and $z_t \sim MVN(0,V^{-1})$, with $V = diag(v_1, \ldots, v_P)$. Note that the dimensions of the matrices $A$, $B$, $C$, and $D$ are (KxK), (KxM), (PxK), and (PxM), respectively.

When a network is estimated with feedback rather than inputs (input = "feedback"), the state space model is

$$x_t = Ax_{t-1} + By_{t-1} + w_t$$
$$y_t = Cx_t + Dy_{t-1} + z_t$$

The parameters of greatest interest are typically contained in the matrix $D$, which encodes the direct interactions among observed variables from one time to the next (in the case of feedback loops), or the direct interactions between inputs and observed variables at each time point (in the case of inputs).

The value of K is chosen prior to running the algorithm by using hankel. The hidden states are estimated using the classic Kalman filter. Posterior distributions of $A$, $B$, $C$, and $D$ are estimated using an empirical Bayes procedure based on a hierarchical Bayesian structure defined over the parameter set. Namely, if $a_{(j)}$, $b_{(j)}$, $c_{(j)}$, and $d_{(j)}$ denote vectors made up of the rows of matrices $A$, $B$, $C$, and $D$ respectively, then

$$a_{(j)} \vert \alpha \sim N(0, diag(\alpha)^{-1})$$
$$b_{(j)} \vert \beta \sim N(0, diag(\beta)^{-1})$$
$$c_{(j)} \vert \gamma \sim N(0, diag(\gamma)^{-1})$$
$$d_{(j)} \vert \delta \sim N(0, diag(\delta)^{-1})$$

where $\alpha = (\alpha_1, \ldots, \alpha_K)$, $\beta = (\beta_1, \ldots, \beta_M)$, $\gamma = (\gamma_1, \ldots, \gamma_K)$, and $\delta = (\delta_1, \ldots, \delta_M)$.
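
To make the feedback form of the state space model concrete, the following base-R sketch simulates one replicate of $x_t$ and $y_t$. It is only an illustration: the dimensions, the values of $A$, $B$, $C$, $D$, and $v$, and the treatment of the first time point are all assumptions made here, and this is not the simulation routine used by the package.

```r
## Illustrative simulation of the feedback model; all values are hypothetical
set.seed(1)
K  <- 2    # number of hidden states
P  <- 3    # number of observed variables
nT <- 10   # number of time points (named nT to avoid masking T/TRUE)

A <- matrix(0.1, K, K)   # K x K
B <- matrix(0.1, K, P)   # K x P (feedback: lagged observations act as inputs, so M = P)
C <- matrix(0.2, P, K)   # P x K
D <- diag(0.5, P)        # P x P
v <- rep(10, P)          # precisions of the observation noise z_t

x <- matrix(0, K, nT)
y <- matrix(0, P, nT)

## First time point: assumed here to have no lagged terms
x[, 1] <- rnorm(K)
y[, 1] <- C %*% x[, 1] + rnorm(P, sd = 1 / sqrt(v))

for (t in 2:nT) {
  ## x_t = A x_{t-1} + B y_{t-1} + w_t, with w_t ~ MVN(0, I)
  x[, t] <- A %*% x[, t - 1] + B %*% y[, t - 1] + rnorm(K)
  ## y_t = C x_t + D y_{t-1} + z_t, with z_t ~ MVN(0, V^{-1})
  y[, t] <- C %*% x[, t] + D %*% y[, t - 1] + rnorm(P, sd = 1 / sqrt(v))
}

dim(y)   # P rows (observed variables) by nT columns (time points)
```

Each replicate passed to ebdbn in the y argument has this (PxT) shape, with one such matrix per list element.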
An EM-like algorithm is used to estimate the hyperparameters in an iterative procedure conditioned on current estimates of the hidden states. conv.1, conv.2, and conv.3 correspond to convergence criteria $\Delta_1$, $\Delta_2$, and $\Delta_3$, respectively, in the reference below. After the algorithm terminates, the z-scores of the $C$ and $D$ matrices can be calculated by passing CPost and CvarPost, or DPost and DvarPost, to zCutoff. These z-scores in turn determine the presence or absence of edges in the network. See the reference below for additional details about the implementation of the algorithm.
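
As a rough illustration of how z-scores can turn posterior means and variances into edge calls, the base-R sketch below divides each posterior mean by its posterior standard deviation and applies a two-sided 95% normal cutoff. The input matrices are made-up numbers, and the two-sided thresholding is an assumption of this sketch; it is not taken from the zCutoff source and may differ from what that function does.

```r
## Hypothetical posterior summaries for a 2 x 2 matrix (values invented for illustration)
post_mean <- matrix(c(0.9, 0.01, -0.6, 0.05), nrow = 2)
post_var  <- matrix(0.04, nrow = 2, ncol = 2)

## z-score: posterior mean divided by posterior standard deviation
z_score <- post_mean / sqrt(post_var)

## Assumed rule: call an edge when |z| exceeds the two-sided 95% normal quantile
edges_95 <- abs(z_score) > qnorm(0.975)
edges_95
```

Entries with a large posterior mean relative to their posterior standard deviation are flagged as edges; the rest are treated as absent.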

References

Andrea Rau, Florence Jaffrezic, Jean-Louis Foulley, and R. W. Doerge (2010). An Empirical Bayesian Method for Estimating Biological Networks from Temporal Microarray Data. Statistical Applications in Genetics and Molecular Biology 9. Article 9.

Examples

library(ebdbNet)
tmp <- runif(1) ## Initialize random number generator
set.seed(125214) ## Set seed for reproducibility

## Simulate data
R <- 5
T <- 10
P <- 10
simData <- simFunc(R, T, P, v = rep(10, P), perc = 0.10)
Dtrue <- simData$Dtrue
y <- simData$y

## Simulate 8 inputs
u <- vector("list", R)
M <- 8
for(r in 1:R) {
	u[[r]] <- matrix(rnorm(M*T), nrow = M, ncol = T)
}

####################################################
## Run EB-DBN without hidden states
####################################################
## Choose alternative value of K using hankel if hidden states are to be estimated
## K <- hankel(y)$dim

## Run algorithm	
net <- ebdbn(input = u, y, K = 0, conv.1 = 0.15, conv.2 = 0.10, conv.3 = 0.10)

## Calculate sensitivities, specificities, and precisions of D matrix
## Use z-score significance level of 95%
z <- zCutoff(net$DPost, net$DvarPost)
sens.95 <- sensitivity(Dtrue, z$z95)
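
For readers unfamiliar with the term, sensitivity is the proportion of true edges that are recovered. The base-R sketch below computes it by hand on two small, invented edge matrices; the variable names are hypothetical, and this page does not document what the package's sensitivity function returns, so the sketch should not be read as its implementation.

```r
## Hypothetical true and estimated edge indicators (1 = edge, 0 = no edge)
true_edges <- matrix(c(1, 0, 1, 0, 0, 1), nrow = 2)
est_edges  <- matrix(c(1, 0, 0, 0, 1, 1), nrow = 2)

tp <- sum(true_edges != 0 & est_edges != 0)   # true edges that were detected
fn <- sum(true_edges != 0 & est_edges == 0)   # true edges that were missed

sens <- tp / (tp + fn)   # sensitivity = TP / (TP + FN)
sens
```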
