
BMA (version 3.12)

bic.glm: Bayesian Model Averaging for generalized linear models.

Description

Bayesian Model Averaging accounts for the model uncertainty inherent in the variable selection problem by averaging over the best models in the model class according to approximate posterior model probability.
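As a minimal sketch of the idea (assuming the BMA package and its dependencies are installed), a logistic regression averaged over models in Occam's window, with the approximate posterior model probabilities inspected afterwards, might look like:

```r
# Minimal sketch; assumes the BMA package is installed.
library(BMA)
library(MASS)
data(birthwt)

# Outcome (low birth weight indicator) and a few numeric/binary predictors.
y <- birthwt$low
x <- birthwt[, c("age", "lwt", "smoke", "ht")]

fit <- bic.glm(x, y, glm.family = "binomial")

fit$postprob   # approximate posterior probabilities of the retained models
fit$probne0    # posterior probability (in percent) that each coefficient is non-zero
```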

Usage

bic.glm(x, y, glm.family, wt = rep(1, nrow(x)), strict = FALSE, 
    prior.param = c(rep(0.5, ncol(x))), OR = 20, maxCol = 30, OR.fix = 2, 
    nbest = 150, dispersion = NULL, factor.type = TRUE, 
    factor.prior.adjust = FALSE, occam.window = TRUE, ...)

bic.glm(f, data, glm.family, wt = rep(1, nrow(data)), strict = FALSE, 
    prior.param = c(rep(0.5, ncol(x))), OR = 20, maxCol = 30, OR.fix = 2, 
    nbest = 150, dispersion = NULL, factor.type = TRUE, 
    factor.prior.adjust = FALSE, occam.window = TRUE, ...)

Arguments

  • x: a matrix or data frame of independent variables
  • y: a vector of values for the dependent variable
  • f: a formula
  • data: a data frame containing the variables in the model
  • glm.family: a description of the error distribution and link function to be used in the model, as for glm
  • wt: an optional vector of weights to be used
  • strict: a logical indicating whether, in addition to Occam's window, the more parsimonious of any two nested models with comparable posterior probability should be retained and the other eliminated
  • prior.param: a vector of prior probabilities that each variable is non-zero
  • OR: a number specifying the maximum ratio for excluding models in Occam's window
  • maxCol: a number specifying the maximum number of columns in the design matrix (including the intercept)
  • OR.fix: the width of the window which keeps models after the leaps approximation is done, as a multiple of OR
  • nbest: a value specifying the number of models of each size returned by the leaps algorithm
  • dispersion: a logical value specifying whether dispersion should be estimated; if unspecified, dispersion is not estimated for the binomial and Poisson families
  • factor.type: a logical indicating whether factor variables enter or leave the model as a whole group of dummy variables rather than individually
  • factor.prior.adjust: a logical indicating whether the prior probabilities on factor variables should be adjusted for the number of levels
  • occam.window: a logical; if FALSE, all models retained by the leaps approximation are averaged over rather than only those in Occam's window

Value

bic.glm returns an object of class bic.glm. The function summary is used to print a summary of the results, the function plot is used to plot posterior distributions for the coefficients, and the function imageplot.bma generates an image of the models which were averaged over. An object of class bic.glm is a list containing at least the following components:
  • postprob: the posterior probabilities of the models selected
  • deviance: the estimated model deviances
  • label: labels identifying the models selected
  • bic: values of BIC for the models
  • size: the number of independent variables in each of the models
  • which: a logical matrix with one row per model and one column per variable, indicating whether that variable is in the model
  • probne0: the posterior probability that each variable is non-zero (in percent)
  • postmean: the posterior mean of each coefficient (from model averaging)
  • postsd: the posterior standard deviation of each coefficient (from model averaging)
  • condpostmean: the posterior mean of each coefficient conditional on the variable being included in the model
  • condpostsd: the posterior standard deviation of each coefficient conditional on the variable being included in the model
  • mle: a matrix with one row per model and one column per variable, giving the maximum likelihood estimate of each coefficient for each model
  • se: a matrix with one row per model and one column per variable, giving the standard error of each coefficient for each model
  • reduced: a logical indicating whether any variables were dropped before model averaging
  • dropped: a vector containing the names of those variables dropped before model averaging
  • call: the matched call that created the bic.glm object
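Given a fitted object, the components above can be combined into a model-averaged summary. The sketch below assumes a hypothetical object glm.out from a previous bic.glm call; note that postmean and postsd include the intercept while probne0 does not.

```r
# Hypothetical fitted object from a previous call, e.g.:
# glm.out <- bic.glm(x, y, glm.family = "binomial")

# Model-averaged coefficient summary: P(beta != 0) in percent,
# posterior mean and posterior sd (dropping the intercept to align
# with probne0; with factor.type = TRUE the factor expansion may
# change the alignment, so check names() first).
cbind(probne0  = glm.out$probne0,
      postmean = glm.out$postmean[-1],
      postsd   = glm.out$postsd[-1])

# Posterior model probabilities and sizes of the models retained
# by Occam's window.
data.frame(model    = glm.out$label,
           postprob = glm.out$postprob,
           size     = glm.out$size)
```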

Synopsis

bic.glm(x, ...)


References

Raftery, Adrian E. (1995). Bayesian model selection in social research (with Discussion). Sociological Methodology 1995 (Peter V. Marsden, ed.), pp. 111-196, Cambridge, Mass.: Blackwells. An earlier version, issued as Working Paper 94-12, Center for Studies in Demography and Ecology, University of Washington (1994) is available as a Postscript file at http://www.stat.washington.edu/tech.reports/bic.ps

See Also

summary.bic.glm, print.bic.glm, plot.bic.glm, imageplot.bma

Examples

### Logistic regression
library(MASS)
data(birthwt)
y <- birthwt$low
x <- data.frame(birthwt[, -1])
x$race <- as.factor(x$race)
x$ht <- (x$ht >= 1) + 0
x <- x[, -9]                    # drop the birth-weight column (outcome in grams)
x$smoke <- as.factor(x$smoke)
x$ptl <- as.factor(x$ptl)
x$ht <- as.factor(x$ht)
x$ui <- as.factor(x$ui)

# Occam's window only; factors enter or leave the model as whole groups
glm.out.FT <- bic.glm(x, y, strict = FALSE, OR = 20,
    glm.family = "binomial", factor.type = TRUE)
summary(glm.out.FT)
imageplot.bma(glm.out.FT)

# Factor levels allowed to enter or leave the model individually
glm.out.FF <- bic.glm(x, y, strict = FALSE, OR = 20,
    glm.family = "binomial", factor.type = FALSE)
summary(glm.out.FF)
imageplot.bma(glm.out.FF)

# strict = TRUE also eliminates nested models with lower posterior probability
glm.out.TT <- bic.glm(x, y, strict = TRUE, OR = 20,
    glm.family = "binomial", factor.type = TRUE)
summary(glm.out.TT)
imageplot.bma(glm.out.TT)

glm.out.TF <- bic.glm(x, y, strict = TRUE, OR = 20,
    glm.family = "binomial", factor.type = FALSE)
summary(glm.out.TF)
imageplot.bma(glm.out.TF)



### Gamma family
library(survival)
data(veteran)
surv.t <- veteran$time
x <- veteran[, -c(3, 4)]                  # drop the time and status columns
x$celltype <- factor(as.character(x$celltype))
sel <- veteran$status == 0                # censored observations
x <- x[!sel, ]
surv.t <- surv.t[!sel]

glm.out.va <- bic.glm(x, y = surv.t, glm.family = Gamma(link = "inverse"),
    factor.type = FALSE)
summary(glm.out.va)
imageplot.bma(glm.out.va)
plot(glm.out.va)



### Poisson family
### Yates (teeth) data

x <- rbind(
    c(0, 0, 0),
    c(0, 1, 0),
    c(1, 0, 0),
    c(1, 1, 1))

y <- c(4, 16, 1, 21)
n <- c(1, 1, 1, 1)

# Candidate model specifications (not used by bic.glm below)
models <- rbind(
    c(1, 1, 0),
    c(1, 1, 1))

glm.out.yates <- bic.glm(x, y, wt = n, glm.family = poisson(),
    factor.type = FALSE)
summary(glm.out.yates)

### Gaussian
library(MASS)
data(UScrime)
f <- formula(log(y) ~ log(M) + So + log(Ed) + log(Po1) + log(Po2) + log(LF) +
    log(M.F) + log(Pop) + log(NW) + log(U1) + log(U2) + log(GDP) + log(Ineq) +
    log(Prob) + log(Time))
glm.out.crime <- bic.glm(f, data = UScrime, glm.family = gaussian())
summary(glm.out.crime)
# Note the problems with the estimation of the posterior standard deviation
# (compare with the bicreg example).
