
modelSelection (version 1.0.5)

modelSelectionGGM: Bayesian model selection for Gaussian graphical models.

Description

Bayesian model selection and estimation for Gaussian graphical models: spike-and-slab priors are set on the off-diagonal entries of the precision (inverse covariance) matrix of y. Settings with p >> n can be handled.

modelSelectionGGM obtains posterior samples of the precision matrix via MCMC. See coef.msfit_ggm for Bayesian model averaging estimates and posterior intervals of its entries, icov for the estimated precision matrix, and msfit_ggm-class for details on the returned object.

Usage

modelSelectionGGM(y, priorCoef=normalidprior(tau=1), 
priorModel=modelbinomprior(1/ncol(y)), 
priorDiag=exponentialprior(lambda=1), center=TRUE, scale=TRUE, 
global_proposal= 'none', prob_global=0.5,
tempering= 0.5, truncratio= 100,
save_proposal= FALSE, niter=10^3, burnin= round(niter/10), 
updates_per_iter= ceiling(sqrt(ncol(y))), updates_per_column= 10,
sampler='Gibbs', pbirth=0.75, pdeath=0.5*(1-pbirth),
bounds_LIT, Omegaini='glasso-ebic', verbose=TRUE)

Value

Posterior inference on the inverse covariance of y. Object of class msfit_ggm, which extends a list with elements

postSample

Posterior samples for the upper-diagonal entries of the precision matrix. Stored as a sparse matrix; see package Matrix for utilities to work with such matrices

prop_accept

If a global proposal is used (global_proposal != 'none'), a vector with the proportion of accepted edge proposals. Note that Omega[,k] is always updated from its exact conditional posterior, regardless of whether the edge proposal is accepted or rejected.

proposal

If a global proposal is used and save_proposal is TRUE, a list with one entry per column of Omega, containing the proposed values for each column

proposaldensity

log-proposal density for the samples in proposal. Entry (i,j) stores the log-proposal density for proposed sample i of column j

margpp

Rao-Blackwellized estimates of posterior marginal inclusion probabilities. Only valid when using the Gibbs algorithm

priors

List storing the priors specified when calling modelSelectionGGM

p

Number of columns in y

indexes

Indicates what row/column of Omega is stored in each column of postSample

samplerPars

MCMC sampling parameters

almost_parallel

Indicates whether a global (almost-parallel) proposal was used, i.e. whether global_proposal was set to a value other than 'none'
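
To illustrate, a minimal sketch of how these elements can be inspected, assuming fit is the msfit_ggm object returned by modelSelectionGGM (as in the Examples below); the element names are those documented above:

# Posterior draws for the upper-diagonal entries of Omega (sparse matrix)
dim(fit$postSample)

# Which row/column of Omega each column of postSample corresponds to
fit$indexes

# Rao-Blackwellized marginal inclusion probabilities (Gibbs sampler only)
fit$margpp

# Priors and MCMC settings used in the fit
fit$priors; fit$samplerPars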

Arguments

y

Data matrix

priorCoef

Prior on off-diagonal entries of the precision matrix, conditional on their not being zero (slab)

priorModel

Prior probabilities on having non-zero off-diagonal entries

priorDiag

Prior on diagonal entries of the precision matrix

center

If TRUE, the columns of y will be centered to zero mean

scale

If TRUE, the columns of y will be scaled to unit sample variance

global_proposal

Either 'none', 'regression' or 'in-sample'. If 'none', serial MCMC is used as specified by sampler. If 'regression', MCMC uses Metropolis-Hastings where models are proposed for each column using regression posterior model probabilities for that column. If 'in-sample', each column's proposal uses posterior model probabilities given a precision matrix estimate for the other columns

prob_global

Probability of proposing a sample from the global proposal; see Details. This argument is ignored if global_proposal == "none".

tempering

If global_proposal != 'none', the posterior model probabilities of the proposal distribution are raised to the power indicated by tempering (set to 1 for no tempering)

truncratio

In the global proposal, any model's proposal probability is >= prob(top model) / truncratio. This ensures bounded weight ratios in the MH step, to improve poor mixing when the current state has low proposal probability, often at the cost of decreasing the acceptance rate. If truncratio <= 0, no truncation is done

save_proposal

If TRUE, the global proposals are saved in proposal (a list with p entries, one per column) and the corresponding proposal densities in proposaldensity. Neither are typically needed, as they were already used to produce the posterior samples in postSample

sampler

Posterior sampler used when global_proposal=="none", and also to run the parallel proposals when global_proposal!="none". Options are "Gibbs" for Gibbs sampling, "birthdeath" for birth-death-swap, and "LIT" for the locally-informed truncated algorithm of Zhou et al (2022)

niter

Number of posterior samples to be obtained. Each iteration consists of selecting updates_per_iter columns of the precision matrix at random and making updates_per_column updates within each

burnin

The first burnin samples will be discarded

updates_per_iter

An iteration consists of selecting updates_per_iter columns at random, and proposing updates_per_column edge updates within each column

updates_per_column

See updates_per_iter

pbirth

Probability of a birth move. The probability of a swap move is 1-pbirth-pdeath. Ignored unless sampler=="birthdeath"

pdeath

Probability of a death move. Ignored unless sampler=="birthdeath"

bounds_LIT

Log-proposal density bounds for the locally-informed LIT algorithm of Zhou et al (2022). These bound the proposal density for a death move and for a birth move. By default, bounds_LIT is log(c("lbound_death"=1/p, "ubound_death"=1, "lbound_birth"=1/p, "ubound_birth"=p))

Omegaini

Initial value of the precision matrix Omega. "null" sets all off-diagonal entries to 0. "glasso-bic" and "glasso-ebic" use GLASSO with regularization parameter set via BIC/EBIC, respectively. Alternatively, Omegaini can be a matrix

verbose

Set verbose=TRUE to print iteration progress
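
As a rough illustration of how these arguments combine (not run here; the prior constructors normalidprior, modelbinomprior and exponentialprior are those shown in Usage, and y is any numeric data matrix), one might request a global 'regression' proposal with a birth-death-swap local kernel:

fit <- modelSelectionGGM(y,
  priorCoef= normalidprior(tau=1),
  priorModel= modelbinomprior(1/ncol(y)),
  priorDiag= exponentialprior(lambda=1),
  global_proposal= 'regression', prob_global= 0.5,
  tempering= 0.5, truncratio= 100,
  sampler= 'birthdeath', pbirth= 0.75, pdeath= 0.125,
  niter= 1000, burnin= 100, Omegaini= 'glasso-ebic', verbose= FALSE)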

Author

David Rossell

Details

Let Omega be the inverse covariance matrix. A spike-and-slab prior is used. Specifically, independent priors are set on all Omega[j,k], and then a positive-definiteness truncation is added.

The prior on the diagonal entries Omega[j,j] is given by priorDiag. Independent spike-and-slab priors are set on the off-diagonal entries: Omega[j,k]=0 with positive probability given by priorModel (spike) and, when non-zero, Omega[j,k] arises from the distribution indicated in priorCoef (slab).

Inference is based on MCMC posterior sampling. All sampling algorithms proceed by updating Omega[,k] given y and Omega[,-k] (of course, Omega[k,] is also set to Omega[,k]). Omega[,k] is updated by first updating the set of non-zero entries (i.e. edges in the graphical model) using either Gibbs sampling or a proposal distribution (see below), and then the non-zero entries of Omega[,k] are updated from their exact posterior given the current set of edges.

An MCMC iteration consists of iterating over updates_per_iter columns chosen at random and, for each column, making updates_per_column proposals, as sketched below.
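
Schematically, the update scheme just described can be sketched in plain R as follows (an illustration only, with dummy dimensions; it is not the implementation used by the package):

p <- 10; niter <- 5
updates_per_iter <- ceiling(sqrt(p)); updates_per_column <- 10
for (it in seq_len(niter)) {
  cols <- sample(seq_len(p), updates_per_iter)   #columns chosen at random
  for (k in cols) {
    for (u in seq_len(updates_per_column)) {
      # (1) update which entries of Omega[,k] are non-zero (edges), via Gibbs,
      #     birth-death-swap, LIT, or a global proposal
      # (2) draw the non-zero entries of Omega[,k] from their exact conditional
      #     posterior given y and Omega[,-k]
    }
    # Omega[k,] is then set equal to Omega[,k] to keep Omega symmetric
  }
}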

If global_proposal == "none", a local MCMC proposal is used to update what entries in Omega[,k] are zero. Local means that the proposed model is a neighbour of the current model, e.g. only one edge is added / killed. Available local kernels are Gibbs, birth-death-swap and LIT (Zhou et al 2022).

If global_proposal == "regression", a global proposal is used. A new model (set of non-zero entries in Omega[,k]) is proposed based on the posterior distribution of a linear regression of y[,k] on the other covariates. prob_global indicates the probability of using the global proposal, otherwise a local proposal is used. Proposal probabilities are tempered by raising them to the power tempering. Further, any model with probability below prob(top model) / truncratio is assigned proposal probability prob(top model) / truncratio.
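
To make the tempering and truncation concrete, here is a small self-contained sketch; the posterior model probabilities below are made up (the package computes them from the regression of y[,k] on the remaining columns), so this only illustrates how the proposal weights are transformed:

postprob <- c(0.70, 0.20, 0.06, 0.03, 0.01)  #hypothetical posterior model probabilities
tempering <- 0.5; truncratio <- 100
w <- postprob^tempering                      #temper: flatten the distribution
w <- pmax(w, max(w) / truncratio)            #truncate: no model below prob(top model)/truncratio
proposal <- w / sum(w)                       #renormalize to obtain proposal probabilities
round(proposal, 3)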

References

Zhou, Q., et al. (2022). Dimension-free mixing for high-dimensional Bayesian variable selection. JRSS-B 84, 1751-1784

See Also

msfit_ggm-class for further details on the output. icov for the estimated precision (inverse covariance) matrix. coef.msfit_ggm for Bayesian model averaging estimates and intervals.

Examples


#Simulate data with p=3
Th= diag(3); Th[1,2]= Th[2,1]= 0.5
sigma= solve(Th)

z= matrix(rnorm(1000*3), ncol=3)
y= z %*% chol(sigma)  #data with covariance sigma

#Obtain posterior samples
#Since y has few columns, initialize the MCMC at the sample precision matrix
#Otherwise, leave the argument Omegaini in modelSelectionGGM at its default
Omegaini= solve(cov(y))
fit= modelSelectionGGM(y, scale=FALSE, Omegaini=Omegaini)

#Parameter estimates, intervals, prob of non-zero
coef(fit)

#Estimated inverse covariance
icov(fit)

#Estimated inverse covariance, with low-probability entries set to 0
icov(fit, threshold=0.95)

#Shows first posterior samples
head(fit$postSample)
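
#Since the data were simulated, one can informally compare the estimate
#to the truth Th (the match is approximate for n=1000)
round(icov(fit), 2)
Th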
