BoomSpikeSlab (version 0.5.2)

lm.spike: Spike and slab regression

Description

MCMC algorithm for linear regression models with a 'spike-and-slab' prior that places some amount of posterior probability at zero for a subset of the regression coefficients.

Usage

lm.spike(formula,
         niter,
         data,
         subset,
         prior = NULL,
         contrasts = NULL,
         drop.unused.levels = TRUE,
         bma.method = c("SSVS", "ODA"),
         oda.options = list(
             fallback.probability = 0.0,
             eigenvalue.fudge.factor = 0.01),
         ping = niter / 10,
         seed = NULL,
         ...)

Arguments

formula
A formula describing the maximal model, i.e. the model with all candidate variables included. This is parsed the same way as a formula passed to lm.
niter
The number of MCMC iterations to run. Be sure to include enough so you can throw away a burn-in set.
data
An optional data frame, list, or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which lm.spike is called.
subset
an optional vector specifying a subset of observations to be used in the fitting process.
prior
An optional list returned by SpikeSlabPrior. If prior is missing then a default prior will be used. See SpikeSlabPrior.
contrasts
An optional list. See the contrasts.arg argument of model.matrix.default.
drop.unused.levels
Logical indicating whether unobserved factor levels should be dropped from the model.
bma.method
The MCMC method to use. SSVS is the stochastic search variable selection algorithm from George and McCulloch (1997). ODA is the orthogonal data augmentation method from Ghosh and Clyde (2011).
oda.options
If bma.method == "ODA" then this is a list of options for fine-tuning the ODA algorithm.
  • fallback.probability: Each MCMC iteration will use SSVS instead of ODA with this probability. In cases where the latent data have high leverage, ODA mixing can suffer. Mixing in a few SSVS steps can help keep an errant algorithm on track.
  • eigenvalue.fudge.factor: The latent X's will be chosen so that the complete-data X'X matrix (after scaling) is a constant diagonal matrix equal to the largest eigenvalue of the observed (scaled) X'X times (1 + eigenvalue.fudge.factor). This should be a small positive number.
ping
The frequency with which to print status update messages to the screen. For example, if ping == 10 then an update will be printed every 10 MCMC iterations.
seed
An integer to use as the random seed for the underlying C++ code. If NULL then the seed will be set using the clock.
...
Extra arguments to be passed to SpikeSlabPrior (if bma.method == "SSVS") or IndependentSpikeSlabPrior (if bma.method == "ODA").
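
As a quick illustration of how these arguments combine, the sketch below constructs a prior explicitly and also fits the same model with the ODA sampler. The data are simulated here only to make the example self-contained, and the arguments passed to SpikeSlabPrior (a design matrix including the intercept column, the response, and expected.model.size) are assumptions about that constructor's interface rather than part of this page.

## Self-contained sketch of explicit-prior and ODA calls.
library(BoomSpikeSlab)
x <- matrix(rnorm(100 * 5), nrow = 100)
y <- rnorm(100, x[, 1] - x[, 2], .5)

## Explicit prior for SSVS (the default bma.method).  The SpikeSlabPrior
## arguments used here are assumptions about its interface.
prior <- SpikeSlabPrior(cbind(1, x), y, expected.model.size = 2)
model.ssvs <- lm.spike(y ~ x, niter = 500, prior = prior)

## The same model fit with ODA, mixing in an occasional SSVS step.
model.oda <- lm.spike(y ~ x, niter = 500, bma.method = "ODA",
                      oda.options = list(fallback.probability = 0.05,
                                         eigenvalue.fudge.factor = 0.01))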

Value

Returns an object of class lm.spike, which is a list with the following elements:
  • beta: A niter by ncol(x) matrix of regression coefficients, many of which may be zero. Each row corresponds to an MCMC iteration.
  • sigma: A vector of length niter containing the MCMC draws of the residual standard deviation parameter.
  • prior: The prior used to fit the model. If a prior was supplied as an argument it will be returned. Otherwise this will be the automatically generated prior based on the other function arguments.
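
For example, the beta matrix can be converted into marginal posterior inclusion probabilities by counting how often each coefficient is nonzero across the retained draws. This is a base-R sketch that assumes a fitted object named model and a burn-in of 100 iterations:

## Marginal posterior inclusion probabilities from the beta draws,
## discarding the first 100 iterations as burn-in.
burn <- 100
inclusion.prob <- colMeans(model$beta[-(1:burn), , drop = FALSE] != 0)
sort(inclusion.prob, decreasing = TRUE)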

Details

Both methods ("ODA" and "SSVS") draw each variable-inclusion indicator given all the others in a Gibbs sampler. The ODA method includes an extra data augmentation step that renders each indicator conditionally independent of the others given the latent data. There is residual dependence between successive MCMC steps introduced by the latent data, but Ghosh and Clyde (2011) argue that, on balance, mixing should be improved.

Regarding overall compute time, the ODA method decomposes the (potentially very large) model matrix once, at the start of the algorithm, and thereafter works with independent scalar updates. The SSVS algorithm avoids that up-front cost but performs many small matrix decompositions in each MCMC iteration. The ODA algorithm is therefore likely to be faster in terms of time per iteration.
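
The per-iteration cost difference can be checked directly with system.time. The sketch below simulates a moderately wide design purely for illustration; the relative timings will depend on n, p, and the sparsity of the true model.

## Rough timing comparison of the two samplers on the same simulated data.
library(BoomSpikeSlab)
x <- matrix(rnorm(500 * 50), nrow = 500)
y <- rnorm(500, x[, 1] + 2 * x[, 2], 1)
system.time(lm.spike(y ~ x, niter = 200, bma.method = "SSVS"))
system.time(lm.spike(y ~ x, niter = 200, bma.method = "ODA"))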

Finally, note that the two algorithms require slightly different priors. The ODA algorithm requires a priori independence of the coefficients (see IndependentSpikeSlabPrior), while the SSVS algorithm can work with arbitrary conjugate priors (see SpikeSlabPrior).
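
Continuing with the simulated x and y above, extra arguments supplied through ... are forwarded to the appropriate prior constructor, so a quantity such as expected.model.size can be set without building a prior by hand. Treat the argument name as an assumption about the prior constructors' interfaces.

## Arguments in '...' go to SpikeSlabPrior (SSVS) or
## IndependentSpikeSlabPrior (ODA); expected.model.size is assumed to be
## accepted by both constructors.
model.ssvs <- lm.spike(y ~ x, niter = 500, expected.model.size = 2)
model.oda  <- lm.spike(y ~ x, niter = 500, bma.method = "ODA",
                       expected.model.size = 2)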

References

George and McCulloch (1997). "Approaches for Bayesian Variable Selection", Statistica Sinica, 7, 339--373. http://www3.stat.sinica.edu.tw/statistica/oldpdf/A7n26.pdf

Ghosh and Clyde (2011). "Rao-Blackwellization for Bayesian variable selection and model averaging in linear and binary regression: A novel data augmentation approach", Journal of the American Statistical Association, 106, 1041--1052. http://homepage.stat.uiowa.edu/~jghsh/ghosh_clyde_2011_jasa.pdf

See Also

SpikeSlabPrior, plot.lm.spike, summary.lm.spike, predict.lm.spike.

Examples

library(BoomSpikeSlab)

n <- 100
p <- 10
ngood <- 3      # number of truly nonzero coefficients
niter <- 1000
sigma <- .8

x <- cbind(1, matrix(rnorm(n * (p - 1)), nrow = n))
beta <- c(rnorm(ngood), rep(0, p - ngood))
y <- rnorm(n, x %*% beta, sigma)
x <- x[, -1]    # drop the explicit intercept column; the formula adds one
model <- lm.spike(y ~ x, niter = niter)
plot.ts(model$beta)     # trace plots of the coefficient draws
opar <- par(ask = TRUE)
hist(model$sigma)       # should be concentrated near 0.8
plot(model)
summary(model)
par(opar)
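
As a follow-up, the returned draws can be summarized with base R alone; for instance, the residual standard deviation after discarding a burn-in set:

## Posterior summary of the residual standard deviation after burn-in.
burn <- 100
quantile(model$sigma[-(1:burn)], c(.025, .5, .975))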
