BAS (version 0.80)

bas.lm: Bayesian Adaptive Sampling Without Replacement for Variable Selection in Linear Models

Description

Sample without replacement from a posterior distribution on models

Usage

bas.lm(formula, data, n.models=NULL,  prior="ZS-null", alpha=NULL,
 modelprior=uniform(),
 initprobs="Uniform", random=TRUE, method="BAS", update=NULL,
 bestmodel = NULL, bestmarg = NULL, prob.local = 0,
 Burnin.iterations = NULL, MCMC.iterations = NULL,
 lambda = NULL, delta = 0.025)

Arguments

formula
linear model formula for the full model with all predictors, Y ~ X. All code assumes that an intercept will be included in each model and that the X's will be centered.
data
data frame
n.models
number of models to sample. If NULL, BAS will enumerate unless p > 25
prior
prior distribution for regression coefficients. Choices include "AIC", "BIC", "g-prior", "ZS-null", "ZS-full", "hyper-g", "hyper-g-laplace", "EB-local", and "EB-global"
alpha
optional hyperparameter in the g-prior or hyper-g prior. For Zellner's g-prior, alpha = g; for the hyper-g method of Liang et al., the recommended choice of alpha is between 2 and 4, with alpha = 3 recommended.
modelprior
Family of prior distributions on the models. Choices include uniform(), Bernoulli(), or beta.binomial()
initprobs
vector of length p with the initial inclusion probabilities used for sampling without replacement (the intercept should be included with probability one), or a character string giving the method used to construct the sampling probabilities
random
Logical variable; if TRUE, use random sampling; otherwise use deterministic sampling (see method)
method
A character variable indicating which sampling method to use: method="BAS" uses Bayesian Adaptive Sampling (without replacement) with the sampling probabilities given in initprobs; method="MCMC+BAS" runs an initial MCMC to calculate the marginal inclusion probabilities, which are then used as the sampling probabilities for BAS
update
number of iterations between potential updates of the sampling probabilities. If NULL, do not update; otherwise the algorithm will update using the marginal inclusion probabilities as they change while sampling takes place. For large model spaces, updating is recommended
bestmodel
optional binary vector representing a model to initialize the sampling. If NULL sampling starts with the null model
bestmarg
optional value for the log marginal associated with the bestmodel
prob.local
An experimental option to allow sampling of models "near" the median probability model. Not recommended for use at this time
Burnin.iterations
Number of iterations to discard when using any of the MCMC options
MCMC.iterations
Number of iterations to run MCMC when MCMC options are used
lambda
Parameter in the AMCMC algorithm.
delta
truncation parameter to prevent sampling probabilities from degenerating to 0 or 1.
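
To illustrate how these arguments fit together, here is a sketch of a typical call. The data frame df and the variable names y, x1, ..., x4 are hypothetical, chosen only for illustration; the argument values mirror the defaults documented above.

```r
library(BAS)

# Hypothetical data frame `df` with response y and predictors x1..x4.
fit <- bas.lm(y ~ x1 + x2 + x3 + x4,
              data = df,
              prior = "ZS-null",        # Zellner-Siow null-based prior on coefficients
              modelprior = uniform(),   # uniform prior over the 2^4 models
              initprobs = "Uniform",    # equal initial inclusion probabilities
              method = "BAS")           # adaptive sampling without replacement
```

With p = 4 predictors and n.models = NULL, this call would enumerate all 16 models rather than sample.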

Value

bas.lm returns an object of class BMA, a list containing at least the following components:
  • postprob: the posterior probabilities of the models selected
  • priorprobs: the prior probabilities of the models selected
  • namesx: the names of the variables
  • R2: R2 values for the models
  • logmarg: values of the log of the marginal likelihood for the models
  • n.vars: total number of independent variables in the full model, including the intercept
  • size: the number of independent variables in each of the models; includes the intercept
  • which: a list of lists, with one list per model, giving the variables that are included in the model
  • probne0: the posterior probability that each variable is non-zero
  • ols: a list of lists, with one list per model, giving the OLS estimate of each (nonzero) coefficient for each model. The intercept is the mean of Y, as each column of X has been centered by subtracting its mean.
  • ols.se: a list of lists, with one list per model, giving the OLS standard error of each coefficient for each model
  • prior: the name of the prior that created the BMA object
  • alpha: value of the hyperparameter in the prior used to create the BMA object
  • modelprior: the prior distribution on models that created the BMA object
  • Y: the response
  • X: the matrix of predictors
  • mean.x: vector of means for each column of X (used in predict.bma)
The function summary.bma is used to print a summary of the results. The function plot.bma is used to plot posterior distributions for the coefficients, and image.bma provides an image of the distribution over models. Posterior summaries of coefficients can be extracted using coefficients.bma. Fitted values and predictions can be obtained using the functions fitted.bma and predict.bma. BMA objects may be updated to use a different prior (without rerunning the sampler) using the function update.bma.
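
A sketch of inspecting these components, using the accessor functions named above. Here fit stands for a hypothetical object returned by bas.lm; generic dispatch to the .bma methods is assumed.

```r
summary(fit)            # print a summary of the top models (summary.bma)
coefficients(fit)       # posterior summaries of the coefficients (coefficients.bma)
image(fit)              # image of the posterior distribution over models (image.bma)
fit$probne0             # marginal posterior inclusion probability of each variable
pred <- predict(fit)    # model-averaged predictions (predict.bma)
```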

Details

BAS provides two search algorithms to find high probability models for use in Bayesian Model Averaging or Bayesian model selection. For p less than about 20-25, BAS can enumerate all models, depending on memory availability; for larger p, BAS samples without replacement using random or deterministic sampling. The Bayesian Adaptive Sampling algorithm of Clyde, Ghosh, and Littman (2009) samples models without replacement using the initial sampling probabilities, and will optionally update the sampling probabilities every "update" models using the estimated marginal inclusion probabilities. If the predictor variables are orthogonal, the deterministic sampler provides a list of the top models in order of their approximate posterior probability, and it provides an effective search when the correlations among variables are small to modest. The priors on coefficients include Zellner's g-prior, the hyper-g prior (Liang et al. 2008), the Zellner-Siow Cauchy prior, and Empirical Bayes (local and global) g-priors. AIC and BIC are also included.
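
The choice between enumeration and sampling described above can be sketched as follows. The data frame df, the threshold of 25, and the iteration counts are illustrative assumptions, not prescriptions from the package.

```r
library(BAS)

p <- 30  # number of candidate predictors in the hypothetical data frame `df`

if (p <= 25) {
  # Small model space: n.models = NULL lets BAS enumerate all 2^p models.
  fit <- bas.lm(y ~ ., data = df, method = "BAS")
} else {
  # Large model space: run an initial MCMC to estimate marginal inclusion
  # probabilities, then sample without replacement, periodically updating
  # the sampling probabilities.
  fit <- bas.lm(y ~ ., data = df, method = "MCMC+BAS",
                Burnin.iterations = 1000,
                MCMC.iterations = 10000,
                update = 100)
}
```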

References

Clyde, M., Ghosh, J. and Littman, M. (2009) Bayesian Adaptive Sampling for Variable Selection and Model Averaging. Department of Statistical Science Discussion Paper 2009-16, Duke University.

Clyde, M. and George, E. I. (2004) Model Uncertainty. Statist. Sci., 19, 81-94. http://www.isds.duke.edu/~clyde/papers/statsci.pdf

Clyde, M. (1999) Bayesian Model Averaging and Model Search Strategies (with discussion). In Bayesian Statistics 6, J. M. Bernardo, A. P. Dawid, J. O. Berger, and A. F. M. Smith, eds. Oxford University Press, pages 157-185.

Hoeting, J. A., Madigan, D., Raftery, A. E. and Volinsky, C. T. (1999) Bayesian model averaging: a tutorial (with discussion). Statist. Sci., 14, 382-401. http://www.stat.washington.edu/www/research/online/hoeting1999.pdf

Liang, F., Paulo, R., Molina, G., Clyde, M. and Berger, J. O. (2005) Mixtures of g-priors for Bayesian Variable Selection. Journal of the American Statistical Association. http://www.stat.duke.edu/05-12.pdf

Zellner, A. (1986) On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti, pp. 233-243. North-Holland/Elsevier.

Zellner, A. and Siow, A. (1980) Posterior odds ratios for selected regression hypotheses. In Bayesian Statistics: Proceedings of the First International Meeting held in Valencia (Spain), pp. 585-603.

See Also

summary.bma, coefficients.bma, print.bma, predict.bma, fitted.bma, plot.bma, image.bma, eplogprob, update.bma

Examples

library(BAS)
demo(BAS.hald)
demo(BAS.USCrime)