Learn R Programming

bspcov (version 1.0.3)

bmspcov: Bayesian Sparse Covariance Estimation

Description

Provides a Bayesian sparse and positive definite estimate of a covariance matrix via the beta-mixture shrinkage prior.

Usage

bmspcov(
  X,
  Sigma,
  prior = list(),
  nsample = list(),
  nchain = 1,
  seed = NULL,
  do.parallel = FALSE,
  show_progress = TRUE
)

Value

Sigma

a nmc \(\times\) p(p+1)/2 matrix including lower triangular elements of covariance matrix. For multiple chains, this becomes a list of matrices.

Phi

a nmc \(\times\) p(p+1)/2 matrix including shrinkage parameters corresponding to lower triangular elements of covariance matrix. For multiple chains, this becomes a list of matrices.

p

dimension of covariance matrix.

mcmctime

elapsed time for MCMC sampling. For multiple chains, this becomes a list.

nchain

number of chains used.

burnin

number of burn-in samples discarded.

nmc

number of MCMC samples retained for analysis.

Arguments

X

a n \(\times\) p data matrix with column mean zero.

Sigma

an initial guess for Sigma.

prior

a list giving the prior information. The list includes the following parameters (with default values in parentheses): a (1/2) and b (1/2) giving the shape parameters for beta distribution, lambda (1) giving the hyperparameter for the diagonal elements, tau1sq (10000/(n*p^4)) giving the hyperparameter for the shrinkage prior of covariance.

nsample

a list giving the MCMC parameters. The list includes the following integers (with default values in parentheses): burnin (1000) giving the number of MCMC samples in transition period, nmc (1000) giving the number of MCMC samples for analysis.

nchain

number of MCMC chains to run. Default is 1.

seed

random seed for reproducibility. If NULL, no seed is set. For multiple chains, each chain gets seed + i - 1.

do.parallel

logical indicating whether to run multiple chains in parallel using future.apply. Default is FALSE. When TRUE, automatically sets up a multisession plan with nchain workers if no parallel plan is already configured.

show_progress

logical indicating whether to show progress bars during MCMC sampling. Default is TRUE. Automatically set to FALSE for individual chains when running in parallel.

Author

Kyoungjae Lee, Seongil Jo, and Kyeongwon Lee

Details

Lee, Jo and Lee (2022) proposed the beta-mixture shrinkage prior for estimating a sparse and positive definite covariance matrix. The beta-mixture shrinkage prior for \(\Sigma = (\sigma_{jk})\) is defined as $$ \pi(\Sigma) = \frac{\pi^{u}(\Sigma)I(\Sigma \in C_p)}{\pi^{u}(\Sigma \in C_p)}, ~ C_p = \{\mbox{ all } p \times p \mbox{ positive definite matrices }\}, $$ where \(\pi^{u}(\cdot)\) is the unconstrained prior given by $$ \pi^{u}(\sigma_{jk} \mid \rho_{jk}) = N\left(\sigma_{jk} \mid 0, \frac{\rho_{jk}}{1 - \rho_{jk}}\tau_1^2\right)$$ $$ \pi^{u}(\rho_{jk}) = Beta(\rho_{jk} \mid a, b), ~ \rho_{jk} = 1 - 1/(1 + \phi_{jk})$$ $$ \pi^{u}(\sigma_{jj}) = Exp(\sigma_{jj} \mid \lambda). $$ For more details, see Lee, Jo and Lee (2022).

References

Lee, K., Jo, S., and Lee, J. (2022), "The beta-mixture shrinkage prior for sparse covariances with near-minimax posterior convergence rate", Journal of Multivariate Analysis, 192, 105067.

See Also

sbmspcov

Examples

Run this code

set.seed(1)
n <- 20
p <- 5

# generate a sparse covariance matrix:
True.Sigma <- matrix(0, nrow = p, ncol = p)
diag(True.Sigma) <- 1
Values <- -runif(n = p*(p-1)/2, min = 0.2, max = 0.8)
nonzeroIND <- which(rbinom(n=p*(p-1)/2,1,prob=1/p)==1)
zeroIND = (1:(p*(p-1)/2))[-nonzeroIND]
Values[zeroIND] <- 0
True.Sigma[lower.tri(True.Sigma)] <- Values
True.Sigma[upper.tri(True.Sigma)] <- t(True.Sigma)[upper.tri(True.Sigma)]
if(min(eigen(True.Sigma)$values) <= 0){
  delta <- -min(eigen(True.Sigma)$values) + 1.0e-5
  True.Sigma <- True.Sigma + delta*diag(p)
}

# generate a data
X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = True.Sigma)

# compute sparse, positive covariance estimator:
fout <- bspcov::bmspcov(X = X, Sigma = diag(diag(cov(X))))
post.est.m <- bspcov::estimate(fout)
sqrt(mean((post.est.m - True.Sigma)^2))
sqrt(mean((cov(X) - True.Sigma)^2))

# Multiple chains example with parallel processing:
# fout_multi <- bspcov::bmspcov(X = X, Sigma = diag(diag(cov(X))), 
#                               nchain = 4, do.parallel = TRUE)
# post.est.multi <- bspcov::estimate(fout_multi)
# When do.parallel = TRUE, the function automatically sets up 
# a multisession plan with nchain workers for parallel execution.

Run the code above in your browser using DataLab