make.spike.slab.prior: Create a spike and slab prior for use with lm.spike.

Description

Creates a spike and slab prior for use with lm.spike.

Usage

SpikeSlabPrior(x,
               y = NULL,
               expected.r2 = .5,
               prior.df = .01,
               expected.model.size = 1,
               prior.information.weight = .01,
               diagonal.shrinkage = .5,
               optional.coefficient.estimate = NULL,
               max.flips = -1,
               mean.y = mean(y, na.rm = TRUE),
               sdy = sd(as.numeric(y), na.rm = TRUE),
               prior.inclusion.probabilities = NULL,
               sigma.upper.limit = Inf)

Arguments

The design matrix for the regression problem. Missing data is not allowed.

The vector of responses for the regression. Missing data is not allowed. If y is not available, you can pass

y =
      NULL

, and specify mean.y and sdy instead.

expected.r2

The expected R-square for the regression. The spike and slab prior requires an inverse gamma prior on the residual variance of the regression. The prior can be parameterized in terms of a guess at the residual variance, and a "degrees of fre

prior.df

A positive scalar representing the prior 'degrees of freedom' for estimating the residual variance. This can be thought of as the amount of weight (expressed as an observation count) given to the expected.r2 argument.

expected.model.size

A positive number less than ncol(x), representing a guess at the number of significant predictor p variables. Used to obtain the 'spike' portion of the spike and slab prior. The default value leads to pi[i] = .5, wh

prior.information.weight

A positive scalar. Number of observations worth of weight that should be given to the prior estimate of beta.

diagonal.shrinkage

The conditionally Gaussian prior for beta (the "slab") starts with a precision matrix equal to the information in a single observation. However, this matrix might not be full rank. The matrix can be made full rank by averaging with its diagon

optional.coefficient.estimate

If desired, an estimate of the regression coefficients can be supplied. In most cases this will be a difficult parameter to specify. If omitted then a prior mean of zero will be used for all coordinates except the intercept, which will b

max.flips

The maximum number of variable inclusion indicators the sampler will attempt to sample each iteration. If max.flips <= 0<="" code=""> then all indicators will be sampled.

mean.y

The mean of the response vector, for use in cases when specifying the response vector is undesirable.

sdy

The standard deviation of the response vector, for use in cases when specifying the response vector is undesirable.

prior.inclusion.probabilities

A vector giving the prior probability of inclusion for each variable.

sigma.upper.limit

The largest acceptable value for the residual standard deviation. A non-positive number is interpreted as Inf.

Value

A list with with the components necessary to run lm.spike.

References

George and McCulloch (1997), "Approaches to Bayesian Variable Selection", Statistica Sinica, 7, 339 -- 373.

http://www3.stat.sinica.edu.tw/statistica/oldpdf/A7n26.pdf

Examples

Run this code

x <- cbind(1, matrix(rnorm(900), ncol = 9))
  beta <- rep(0, 10)
  beta[1] <- 3
  beta[5] <- -4
  beta[8] <- 2
  y <- rnorm(100, x %*% beta)
  ## x has 10 columns, including the intercept
  prior <- SpikeSlabPrior(x, y,
             expected.model.size = 3,  # expect 3 nonzero predictors
             prior.df = .01,           # weaker prior than the default
             prior.information.weight = .01,
             diagonal.shrinkage = 0,   # use Zellner's prior
             optional.coefficient.estimate = rep(0, 10) # shrink to zero
          )
  ## now 'prior' can be fed to 'lm.spike'
  model <- lm.spike(y ~ x - 1, niter = 1000, prior = prior)

Run the code above in your browser using DataLab