corlatent: Graphical Models with Latent Variables and Correlated Replicates

Description

Estimate graphical models with latent variables and correlated replicates using the method in Jin et al. (2020).

Usage

corlatent(data, accuracy, n, R, p, lambda1, lambda2, lambda3, distribution = "Gaussian",
rule = "AND")

Arguments

data

data set. Can be a matrix, list, array, or data frame. A matrix should have $nR$ rows and $p$ columns, formed by stacking $n$ blocks row-wise, where each block has $R$ rows and $p$ columns. A data frame has the same dimensions and structure as the matrix. An array has dimension $(R, p, n)$. A list should have $n$ elements, each a matrix with $R$ rows and $p$ columns.

accuracy

the threshold at which the algorithm stops. The algorithm stops when the difference between the estimates from the $(k-1)$th iteration and the $k$th iteration is smaller than the value of accuracy.

n

the number of observations.

R

the number of replicates for each observation.

p

the number of observed variables.

lambda1

tuning parameter that encourages the estimated graph to be sparse.

lambda2

tuning parameter that models the effects of correlated replicates. Usually set to be equal to lambda1.

lambda3

tuning parameter that encourages the latent effect to be piecewise constant.

distribution

distribution of the data. For a data set with a Gaussian distribution, use "Gaussian"; for a data set with an Ising distribution, use "Ising". Default is "Gaussian".

rule

rule used to combine the matrices that encode the conditional dependence relationships between pairs of observed variables. Options are "AND" and "OR". Default is "AND".
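The four layouts accepted by data can be illustrated with a small sketch. The dimensions and the simulated values below are hypothetical and chosen only to show how the same replicates are arranged in each format; the sketch does not call corlatent itself.

```r
# Toy dimensions (hypothetical values, for illustration only)
n <- 3  # observations
R <- 4  # replicates per observation
p <- 2  # observed variables

set.seed(1)

# List layout: n elements, each an R x p matrix
data_list <- lapply(seq_len(n), function(i) {
  matrix(rnorm(R * p), nrow = R, ncol = p)
})

# Matrix layout: stack the n blocks row-wise into an (n * R) x p matrix
data_matrix <- do.call(rbind, data_list)

# Data frame layout: same dimensions and row order as the matrix
data_frame <- as.data.frame(data_matrix)

# Array layout: dimension (R, p, n), one R x p slice per observation
data_array <- array(unlist(data_list), dim = c(R, p, n))

dim(data_matrix)  # n * R rows and p columns
dim(data_array)   # R, p, n
```

In this arrangement, rows (R + 1) to 2R of data_matrix, the slice data_array[, , 2], and data_list[[2]] all hold the replicates of the second observation.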

Value

omega

a matrix that encodes the conditional dependence relationships between pairs of observed variables

theta

the adjacency matrix, in which 0 and 1 encode conditional independence and conditional dependence between pairs of observed variables, respectively

penalties

the penalty values
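As a rough illustration of how the "AND" and "OR" rules differ, the sketch below combines a small asymmetric matrix of node-wise estimates into a symmetric 0/1 adjacency matrix. The matrix omega_hat and the helper combine_support are hypothetical and follow the usual neighborhood-selection convention; how corlatent performs this step internally is not documented here.

```r
# Hypothetical asymmetric estimate: row j holds the fitted neighborhood of variable j
omega_hat <- matrix(c(0,   0.5, 0,
                      0.3, 0,   0,
                      0,   0.2, 0),
                    nrow = 3, byrow = TRUE)

# Hypothetical helper: combine the two directed estimates for each pair (j, k)
combine_support <- function(omega, rule = c("AND", "OR")) {
  rule <- match.arg(rule)
  nz <- omega != 0                     # nonzero pattern of each neighborhood
  adj <- if (rule == "AND") nz & t(nz) else nz | t(nz)
  diag(adj) <- FALSE                   # no self-edges
  adj * 1                              # convert logical to 0/1
}

combine_support(omega_hat, "AND")  # edge between variables 1 and 2 only
combine_support(omega_hat, "OR")   # edges 1-2 and 2-3
```

Under "AND", an edge is kept only when both variables select each other, so it gives a sparser graph than "OR" for the same estimates.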

Details

The corlatent method aims at modeling exponential family graphical models with correlated replicates and latent variables. It relies on two assumptions. Assumption 1 states that the $R$ replicates follow a one-lag vector autoregressive model, conditioned on the latent variables. Assumption 2 states that the latent variables are piecewise constant across replicates.

Based on these two assumptions, the method solves the following problem for $1 \le j \le p$:

$$ \min_{\theta_{j,-j}, \alpha_j, \Delta_j} \left\{ -\frac{1}{nR} l(\theta_{j,-j}, \alpha_j, \Delta_j) + \lambda \|\theta_{j,-j}\|_1 + \beta \|\alpha_j\|_1 + \gamma \|(I_n \otimes C)\Delta_j\|_1 \right\}, $$

where $l(\theta_{j,-j}, \alpha_j, \Delta_j)$ is the log-likelihood function, $\theta_{j,-j}$ encodes the conditional dependence relationships between the $j$th observed variable and the other observed variables, $\alpha_j$ models the correlation among replicates, and $\Delta_j$ encodes the latent effect. The tuning parameters $\lambda$, $\beta$, and $\gamma$ play the roles described for lambda1, lambda2, and lambda3, respectively. $I_n$ is the $n \times n$ identity matrix, and $C$ is the discrete first derivative matrix whose $i$th row has $-1$ in column $i$ and $1$ in column $i+1$.
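The penalty on the latent effect can be made concrete with a short sketch. It assumes that $C$ has $R - 1$ rows and $R$ columns and that $\Delta_j$ is a vector of length $nR$ stacked observation by observation; neither detail is stated above, and the helper first_difference and the vector delta are hypothetical.

```r
# First-difference matrix C with R - 1 rows and R columns (assumed dimensions):
# row i has -1 in column i and 1 in column i + 1
first_difference <- function(R) {
  C <- matrix(0, nrow = R - 1, ncol = R)
  idx <- seq_len(R - 1)
  C[cbind(idx, idx)] <- -1
  C[cbind(idx, idx + 1)] <- 1
  C
}

n <- 2
R <- 4
C <- first_difference(R)
D <- kronecker(diag(n), C)  # I_n (x) C: differences within each observation only

# Hypothetical latent effect, stacked observation by observation
delta <- c(1, 1, 3, 3,   # observation 1: one jump of size 2
           0, 0, 0, 0)   # observation 2: constant

sum(abs(D %*% delta))    # L1 norm of the differences, here 2
```

Because the Kronecker product is block diagonal, the change from the last replicate of one observation to the first replicate of the next is not penalized; only jumps within an observation contribute, which is what pushes the latent effect toward being piecewise constant across replicates.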

References

Jin, Y., Ning, Y., and Tan, K. M. (2020), "Exponential Family Graphical Models with Correlated Replicates and Unmeasured Confounders", preprint available.

Examples

# NOT RUN {
# Gaussian distribution with "AND" rule
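n <- 20  # number of observations
R <- 10  # number of replicates for each observation
p <- 5   # number of observed variables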
l <- 2
s <- 2
seed <- 1

data <- generate_Gaussian(n, R, p, l, s, sparsityA = 0.95, sparsityobserved = 0.9,
                          sparsitylatent = 0.2, lwb = 0.3, upb = 0.3, seed)$X

result <- corlatent(data, accuracy = 1e-6, n, R, p, lambda1 = 0.1, lambda2 = 0.1,
                    lambda3 = 1e+5, distribution = "Gaussian", rule = "AND")
# }
