
BMA (version 3.12)

bic.glm: Bayesian Model Averaging for generalized linear models.

Description

Bayesian Model Averaging accounts for the model uncertainty inherent in the variable selection problem by averaging over the best models in the model class according to approximate posterior model probability.
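As a minimal sketch of the idea (assuming the BMA package and its dependencies are installed), a logistic regression averaged over models in Occam's window, with the approximate posterior model probabilities inspected afterwards, might look like:

```r
# Minimal sketch; assumes the BMA package is installed.
library(BMA)
library(MASS)
data(birthwt)

# Outcome (low birth weight indicator) and a few numeric/binary predictors.
y <- birthwt$low
x <- birthwt[, c("age", "lwt", "smoke", "ht")]

fit <- bic.glm(x, y, glm.family = "binomial")

fit$postprob   # approximate posterior probabilities of the retained models
fit$probne0    # posterior probability (in percent) that each coefficient is non-zero
```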

Usage

bic.glm(x, y, glm.family, wt = rep(1, nrow(x)), strict = FALSE, 
    prior.param = c(rep(0.5, ncol(x))), OR = 20, maxCol = 30, OR.fix = 2, 
    nbest = 150, dispersion = NULL, factor.type = TRUE, 
    factor.prior.adjust = FALSE, occam.window = TRUE, ...)

bic.glm(f, data, glm.family, wt = rep(1, nrow(data)), strict = FALSE, 
    prior.param = c(rep(0.5, ncol(x))), OR = 20, maxCol = 30, OR.fix = 2, 
    nbest = 150, dispersion = NULL, factor.type = TRUE, 
    factor.prior.adjust = FALSE, occam.window = TRUE, ...)

Arguments

  • x: a matrix or data frame of independent variables
  • y: a vector of values for the dependent variable
  • f: a formula
  • data: a data frame containing the variables in the model
  • glm.family: a description of the error distribution and link function to be used in the model, as for glm
  • wt: an optional vector of weights to be used
  • strict: a logical indicating whether, in addition to Occam's window, the more parsimonious of any two nested models with comparable posterior probability should be retained and the other eliminated
  • prior.param: a vector of prior probabilities that each variable is non-zero
  • OR: a number specifying the maximum ratio for excluding models in Occam's window
  • maxCol: a number specifying the maximum number of columns in the design matrix (including the intercept)
  • OR.fix: the width of the window which keeps models after the leaps approximation is done, as a multiple of OR
  • nbest: a value specifying the number of models of each size returned by the leaps algorithm
  • dispersion: a logical value specifying whether dispersion should be estimated; if unspecified, dispersion is not estimated for the binomial and Poisson families
  • factor.type: a logical indicating whether factor variables enter or leave the model as a whole group of dummy variables rather than individually
  • factor.prior.adjust: a logical indicating whether the prior probabilities on factor variables should be adjusted for the number of levels
  • occam.window: a logical; if FALSE, all models retained by the leaps approximation are averaged over rather than only those in Occam's window

Value

bic.glm returns an object of class bic.glm. The function summary is used to print a summary of the results, the function plot is used to plot posterior distributions for the coefficients, and the function imageplot.bma generates an image of the models which were averaged over. An object of class bic.glm is a list containing at least the following components:
  • postprob: the posterior probabilities of the models selected
  • deviance: the estimated model deviances
  • label: labels identifying the models selected
  • bic: values of BIC for the models
  • size: the number of independent variables in each of the models
  • which: a logical matrix with one row per model and one column per variable, indicating whether that variable is in the model
  • probne0: the posterior probability that each variable is non-zero (in percent)
  • postmean: the posterior mean of each coefficient (from model averaging)
  • postsd: the posterior standard deviation of each coefficient (from model averaging)
  • condpostmean: the posterior mean of each coefficient conditional on the variable being included in the model
  • condpostsd: the posterior standard deviation of each coefficient conditional on the variable being included in the model
  • mle: a matrix with one row per model and one column per variable, giving the maximum likelihood estimate of each coefficient for each model
  • se: a matrix with one row per model and one column per variable, giving the standard error of each coefficient for each model
  • reduced: a logical indicating whether any variables were dropped before model averaging
  • dropped: a vector containing the names of those variables dropped before model averaging
  • call: the matched call that created the bic.glm object
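Given a fitted object, the components above can be combined into a model-averaged summary. The sketch below assumes a hypothetical object glm.out from a previous bic.glm call; note that postmean and postsd include the intercept while probne0 does not.

```r
# Hypothetical fitted object from a previous call, e.g.:
# glm.out <- bic.glm(x, y, glm.family = "binomial")

# Model-averaged coefficient summary: P(beta != 0) in percent,
# posterior mean and posterior sd (dropping the intercept to align
# with probne0; with factor.type = TRUE the factor expansion may
# change the alignment, so check names() first).
cbind(probne0  = glm.out$probne0,
      postmean = glm.out$postmean[-1],
      postsd   = glm.out$postsd[-1])

# Posterior model probabilities and sizes of the models retained
# by Occam's window.
data.frame(model    = glm.out$label,
           postprob = glm.out$postprob,
           size     = glm.out$size)
```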

Synopsis

bic.glm(x, ...)


References

Raftery, Adrian E. (1995). Bayesian model selection in social research (with Discussion). Sociological Methodology 1995 (Peter V. Marsden, ed.), pp. 111-196, Cambridge, Mass.: Blackwells. An earlier version, issued as Working Paper 94-12, Center for Studies in Demography and Ecology, University of Washington (1994) is available as a Postscript file at http://www.stat.washington.edu/tech.reports/bic.ps

See Also

summary.bic.glm, print.bic.glm, plot.bic.glm, imageplot.bma

Examples

### Logistic regression
library(MASS)
data(birthwt)
y <- birthwt$low
x <- data.frame(birthwt[, -1])
x$race <- as.factor(x$race)
x$ht <- (x$ht >= 1) + 0
x <- x[, -9]                    # drop the birth-weight column (outcome in grams)
x$smoke <- as.factor(x$smoke)
x$ptl <- as.factor(x$ptl)
x$ht <- as.factor(x$ht)
x$ui <- as.factor(x$ui)

# Occam's window only; factors enter or leave the model as whole groups
glm.out.FT <- bic.glm(x, y, strict = FALSE, OR = 20,
    glm.family = "binomial", factor.type = TRUE)
summary(glm.out.FT)
imageplot.bma(glm.out.FT)

# Factor levels allowed to enter or leave the model individually
glm.out.FF <- bic.glm(x, y, strict = FALSE, OR = 20,
    glm.family = "binomial", factor.type = FALSE)
summary(glm.out.FF)
imageplot.bma(glm.out.FF)

# strict = TRUE also eliminates nested models with lower posterior probability
glm.out.TT <- bic.glm(x, y, strict = TRUE, OR = 20,
    glm.family = "binomial", factor.type = TRUE)
summary(glm.out.TT)
imageplot.bma(glm.out.TT)

glm.out.TF <- bic.glm(x, y, strict = TRUE, OR = 20,
    glm.family = "binomial", factor.type = FALSE)
summary(glm.out.TF)
imageplot.bma(glm.out.TF)



### Gamma family
library(survival)
data(veteran)
surv.t <- veteran$time
x <- veteran[, -c(3, 4)]                  # drop the time and status columns
x$celltype <- factor(as.character(x$celltype))
sel <- veteran$status == 0                # censored observations
x <- x[!sel, ]
surv.t <- surv.t[!sel]

glm.out.va <- bic.glm(x, y = surv.t, glm.family = Gamma(link = "inverse"),
    factor.type = FALSE)
summary(glm.out.va)
imageplot.bma(glm.out.va)
plot(glm.out.va)



### Poisson family
### Yates (teeth) data

x <- rbind(
    c(0, 0, 0),
    c(0, 1, 0),
    c(1, 0, 0),
    c(1, 1, 1))

y <- c(4, 16, 1, 21)
n <- c(1, 1, 1, 1)

# Candidate model specifications (not used by bic.glm below)
models <- rbind(
    c(1, 1, 0),
    c(1, 1, 1))

glm.out.yates <- bic.glm(x, y, wt = n, glm.family = poisson(),
    factor.type = FALSE)
summary(glm.out.yates)

### Gaussian
library(MASS)
data(UScrime)
f <- formula(log(y) ~ log(M) + So + log(Ed) + log(Po1) + log(Po2) + log(LF) +
    log(M.F) + log(Pop) + log(NW) + log(U1) + log(U2) + log(GDP) + log(Ineq) +
    log(Prob) + log(Time))
glm.out.crime <- bic.glm(f, data = UScrime, glm.family = gaussian())
summary(glm.out.crime)
# Note the problems with the estimation of the posterior standard deviation
# (compare with the bicreg example).
