Fused ridge penalized maximum likelihood estimation of the parameters of group-specific first-order Vector Auto-Regressive models, with a (possibly) unbalanced experimental set-up. The VAR(1)-processes are assumed to have mean zero.
ridgeVAR1fused(Y, id, lambdaA=0, lambdaF=0, lambdaP=0,
targetA=matrix(0, dim(Y)[1], dim(Y)[1]),
targetP=matrix(0, dim(Y)[1], dim(Y)[1]),
targetPtype="none", fitA="ml",
zerosA=matrix(nrow=0, ncol=2), zerosAfit="sparse",
zerosP=matrix(nrow=0, ncol=2), cliquesP=list(),
separatorsP=list(), unbalanced=matrix(nrow=0, ncol=2),
diagP=FALSE, efficient=TRUE, nInit=100, nInitA=5,
minSuccDiff=0.001, minSuccDiffA=0.001)
Y: Three-dimensional array containing the data. The first, second and third dimensions correspond to covariates, time and samples, respectively. The data are assumed to be centered covariate-wise.
id: A vector of group indices, comprising integers only. The first group is represented by '0', the next by '1', and so on until the last.
lambdaA: Ridge penalty parameter (positive numeric of length 1) to be used in the estimation of the \(\mathbf{A}_g\), the matrices with autoregression coefficients.
lambdaF: Fused ridge penalty parameter (positive numeric of length 1) to be used in the estimation of the \(\mathbf{A}_g\), the matrices with autoregression coefficients.
lambdaP: Ridge penalty parameter (positive numeric of length 1) to be used in the estimation of the inverse error covariance matrix \(\mathbf{\Omega}_{\varepsilon} (=\mathbf{\Sigma_{\varepsilon}^{-1}})\): the precision matrix of the errors.
targetA: Target matrix to which the matrices \(\mathbf{A}_g\) are to be shrunken. This target is shared among the groups (otherwise, why fuse?).
targetP: Target matrix to which the inverse error covariance matrix, the precision matrix, is to be shrunken.
fitA: A character. If fitA="ml", the parameters \(\mathbf{A}_g\) are estimated by (penalized) maximum likelihood. If fitA="ss", they are estimated by (penalized) sum of squares. The latter is much faster.
targetPtype: A character indicating the type of target to be used for the precision matrix. When specified, it overrules the targetP option. See the default.target function for the options.
zerosA: A matrix with indices of entries of the \(\mathbf{A}_g\)'s that are constrained to zero. The matrix comprises two columns, each row corresponding to an entry of the \(\mathbf{A}_g\)'s. The first column contains the row indices and the second the column indices. The support is shared among the groups (otherwise, why fuse?).
zerosAfit: A character, either "sparse" or "dense". With "sparse", the \(\mathbf{A}_g\)'s are assumed to contain many zeros and a computationally efficient implementation of their estimation is employed. With "dense", the \(\mathbf{A}_g\)'s are assumed to contain only few zeros and the estimation method is optimized computationally accordingly.
zerosP: A matrix with indices of entries of the precision matrix that are constrained to zero. The matrix comprises two columns, each row corresponding to an entry of the adjacency matrix. The first column contains the row indices and the second the column indices. The specified graph should be undirected and decomposable. If it is not, it is symmetrized and triangulated (unless cliquesP and separatorsP are supplied). Hence, the employed zero structure may differ from the input zerosP.
cliquesP: A list object containing the node indices per clique, as obtained from the rip function.
separatorsP: A list object containing the node indices per separator, as obtained from the rip function.
unbalanced: A matrix with two columns, indicating the unbalances in the design. Each row represents a missing design point in the (time x individual) layout. The first and second columns indicate the time point and the individual (respectively) of the missing design point.
diagP: A logical indicating whether the inverse error covariance matrix is assumed to be diagonal.
efficient: A logical affecting the estimation of the \(\mathbf{A}_g\); see Details below.
nInit: Maximum number of iterations (positive numeric of length 1) to be used in maximum likelihood estimation.
nInitA: Maximum number of iterations (positive numeric of length 1) to be used in the fused estimation of the autoregression matrices \(\mathbf{A}_g\), given the current estimate of \(\mathbf{\Omega}_{\varepsilon}\).
minSuccDiff: Minimum distance (positive numeric of length 1) between estimates of two successive iterations to be achieved.
minSuccDiffA: Minimum distance (positive numeric of length 1) between the \(\mathbf{A}_g\) estimates of two successive fused estimation iterations to be achieved.
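The layout expected by the Y, id, and unbalanced arguments can be sketched in numpy (illustrative only: the package itself is R, and the group sizes and missing design points below are hypothetical):

```python
import numpy as np

# Illustrative sketch (not the R package) of the input layout.
# Y[p, t, i]: covariate p, time point t, sample i -- centered covariate-wise.
p, T, n = 3, 10, 12
rng = np.random.default_rng(0)
Y = rng.standard_normal((p, T, n)) + 5.0      # raw, uncentered data
Y = Y - Y.mean(axis=(1, 2), keepdims=True)    # center each covariate

# id: zero-based group indices, one per sample (here 3 groups of 4 samples).
id = np.repeat([0, 1, 2], n // 3)

# unbalanced: one row per missing (time, individual) design point,
# e.g. individual 2 unobserved at time 4, individual 5 at time 7
# (hypothetical values; indices are 1-based, as in R).
unbalanced = np.array([[4, 2],
                       [7, 5]])
```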
A list-object with slots:
Ridge ML estimates of the matrices \(\mathbf{A}_g\), stacked and stored as a single rectangular matrix.
Ridge ML estimate of the inverse error covariance matrix
\(\mathbf{\Omega}_{\varepsilon} (=\mathbf{\Sigma_{\varepsilon}^{-1}})\).
Positive numeric
of length one: ridge penalty used in the estimation of the \(\mathbf{A}_g\).
Positive numeric
of length one: fused ridge penalty used in the estimation of the \(\mathbf{A}_g\).
Positive numeric of length one: ridge penalty used in the estimation of the inverse error covariance matrix \(\mathbf{\Omega}_{\varepsilon} (=\mathbf{\Sigma_{\varepsilon}^{-1}})\).
If diagP=TRUE, no penalization is applied to the estimation of the covariance matrix. Consequently, the arguments lambdaP and targetP are ignored (if supplied).
The ridge ML estimator employs the following estimator of the variance of the VAR(1) process:
$$ \frac{1}{n (\mathcal{T} - 1)} \sum_{i=1}^{n} \sum_{t=2}^{\mathcal{T}} \mathbf{Y}_{\ast,i,t} \mathbf{Y}_{\ast,i,t}^{\mathrm{T}}. $$
This estimator is used when efficient=FALSE. However, a more efficient estimator of this variance,
$$ \frac{1}{n \mathcal{T}} \sum_{i=1}^{n} \sum_{t=1}^{\mathcal{T}} \mathbf{Y}_{\ast,i,t} \mathbf{Y}_{\ast,i,t}^{\mathrm{T}}, $$
is used when efficient=TRUE.
. Both estimators are adjusted accordingly when dealing with an unbalanced design.
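The two estimators above can be sketched as follows (a numpy illustration, not the package's implementation; the unbalanced-design adjustment is omitted):

```python
import numpy as np

# Sketch of the two variance estimators of the VAR(1) process.
# Y has shape (p, T, n): covariates x time points x individuals.

def var_estimate(Y, efficient=True):
    p, T, n = Y.shape
    if efficient:
        ts = range(T)            # t = 1, ..., T; divisor n*T
        denom = n * T
    else:
        ts = range(1, T)         # t = 2, ..., T; divisor n*(T - 1)
        denom = n * (T - 1)
    S = np.zeros((p, p))
    for i in range(n):
        for t in ts:
            y = Y[:, t, i]
            S += np.outer(y, y)  # Y_{*,i,t} Y_{*,i,t}^T
    return S / denom

rng = np.random.default_rng(1)
Y = rng.standard_normal((3, 10, 12))
S_ml  = var_estimate(Y, efficient=False)   # ridge ML variant
S_eff = var_estimate(Y, efficient=True)    # efficient variant
```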
Miok, V., Wilting, S.M., Van Wieringen, W.N. (2019), ``Ridge estimation of network models from time-course omics data'', Biometrical Journal, 61(2), 391-405.
# load the package providing ridgeVAR1fused, createA and dataVAR1
library(ragt2ridges)

# set dimensions (p=covariates, n=individuals, T=time points, G=groups)
p <- 3; n <- 12; T <- 10; G <- 3
# set model parameters
SigmaE <- matrix(1/2, p, p)
diag(SigmaE) <- 1
A1 <- -createA(p, "clique", nCliques=1, nonzeroA=0.1)
A2 <- t(createA(p, "chain", nBands=1, nonzeroA=0.1))
A3 <- (A1 + A2) / 2
# generate data
Y1 <- dataVAR1(n/G, T, A1, SigmaE)
Y2 <- dataVAR1(n/G, T, A2, SigmaE)
Y3 <- dataVAR1(n/G, T, A3, SigmaE)
Y <- abind::abind(Y1, Y2, Y3, along=3)
id <- c(rep(1, n/G), rep(2, n/G), rep(3, n/G))-1
VAR1hats <- ridgeVAR1fused(Y, id, lambdaA=1, lambdaF=1, lambdaP=1)