ridgePrepEdiag: Ridge penalized estimation of the precision matrix from data with replicates.

Description

Estimation of precision matrices from data with replicates through a ridge penalized EM (Expectation-Maximization) algorithm. It assumes a simple 'signal+noise' model, both random variables are assumed to be drawn from a multivariate normal distribution with their own precision matrix. The signal precision matrix is unstructured, while the former is diagonal. These precision matrices are estimated.

Usage

ridgePrepEdiag(Y, ids, lambdaZ,		
               targetZ=matrix(0, ncol(Y), ncol(Y)),
               nInit=100, minSuccDiff=10^(-10))

Value

The function returns the regularized inverse covariance list-object with slots:

Pz: The estimated signal precision matrix.
Pe: The estimated error precision matrix.
penLL: The penalized loglikelihood of the estimated model.

Arguments

Y: Data matrix with samples (including the repetitions) as rows and variates as columns.
ids: A numeric indicating which rows of Y belong to the same individal.
lambdaZ: A positive numeric representing the ridge penalty parameter for the signal precision matrix estimate.
targetZ: A semi-positive definite target matrix towards which the signal precision matrix estimate is shrunken.
nInit: A numeric specifying the number of iterations.
minSuccDiff: A numeric: minimum successive difference (in terms of the relative change in the absolute difference of the penalized loglikelihood) between two succesive estimates to be achieved.

Author

W.N. van Wieringen.

Details

Data are assumed to originate from a design with replicates. Each observation \(\mathbf{Y}_{i,k_i}\) with \(k_i\) (\(k_i = 1, \ldots, K_i\)) the \(k_i\)-th replicate of the \(i\)-th sample, is described by a `signal+noise' model: \(\mathbf{Y}_{i,k_i} = \mathbf{Z}_i + \boldsymbol{\varepsilon}_{i,k_i}\), where \(\mathbf{Z}_i\) and \(\boldsymbol{\varepsilon}_{i,k_i}\) represent the signal and noise, respectively. Each observation \(\mathbf{Y}_{i,k_i}\) follows a multivariate normal law of the form \(\mathbf{Y}_{i,k_i} \sim \mathcal{N}(\mathbf{0}_p, \boldsymbol{\Omega}_z^{-1} + \boldsymbol{\Omega}_{\varepsilon}^{-1})\), which results from the distributional assumptions of the signal and the noise, \(\mathbf{Z}_{i} \sim \mathcal{N}(\mathbf{0}_p, \boldsymbol{\Omega}_z^{-1})\) and \(\boldsymbol{\varepsilon}_{i, k_i} \sim \mathcal{N}(\mathbf{0}_p, \boldsymbol{\Omega}_{\varepsilon}^{-1})\) with \(\boldsymbol{\Omega}_{\varepsilon}\) diagonal, and their independence. The model parameters are estimated by means of a penalized EM algorithm that maximizes the loglikelihood augmented with the penalty \(\lambda_z \| \boldsymbol{\Omega}_z - \mathbf{T}_z \|_F^2\), in which \(\mathbf{T}_z\) is the shrinkage target of the signal precision matrix. For more details see van Wieringen and Chen (2019).

References

van Wieringen, W.N., Chen, Y. (2021), "Penalized estimation of the Gaussian graphical model from data with replicates", Statistics in Medicine, 40(19), 4279-4293.

Examples

Run this code

# set parameters
p        <- 10
Se       <- diag(runif(p))
Sz       <- matrix(3, p, p)
diag(Sz) <- 4

# draw data
n <- 100
ids <- numeric()
Y   <- numeric()
for (i in 1:n){
     Ki <- sample(2:5, 1)
     Zi <- mvtnorm::rmvnorm(1, sigma=Sz)
     for (k in 1:Ki){
          Y   <- rbind(Y, Zi + mvtnorm::rmvnorm(1, sigma=Se))
          ids <- c(ids, i)
     }
}

# estimate
Ps <- ridgePrepEdiag(Y, ids, 1)

Run the code above in your browser using DataLab