independent.student.spike.slab.prior: Spike and Slab Prior for Regressions with Student T Errors

Description

A spike and slab prior on the parameters of a regression model with Student T errors. The prior assumes independence among the regression coefficients.

Usage

StudentIndependentSpikeSlabPrior(
    predictor.matrix = NULL,
    response.vector = NULL,
    expected.r2 = .5,
    prior.df = .01,
    expected.model.size = 1,
    prior.beta.sd = NULL,
    optional.coefficient.estimate = NULL,
    mean.y = mean(response.vector, na.rm = TRUE),
    sdy = sd(as.numeric(response.vector), na.rm = TRUE),
    sdx = apply(as.matrix(predictor.matrix), 2, sd, na.rm = TRUE),
    prior.inclusion.probabilities = NULL,
    number.of.observations = nrow(predictor.matrix),
    number.of.variables = ncol(predictor.matrix),
    scale.by.residual.variance = FALSE,
    sigma.upper.limit = Inf,
    degrees.of.freedom.prior = UniformPrior(.1, 100))

Value

An IndependentSpikeSlabPrior with degrees.of.freedom.prior appended.
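A minimal sketch of building the prior and inspecting the returned object follows. The simulated data, the coefficient values, and the UniformPrior(1, 50) bounds are illustrative assumptions, not recommendations; the block only runs if the BoomSpikeSlab package is installed.

```r
# Sketch only: skipped silently when BoomSpikeSlab is not installed.
if (requireNamespace("BoomSpikeSlab", quietly = TRUE)) {
  library(BoomSpikeSlab)  # UniformPrior comes from Boom, which BoomSpikeSlab depends on

  set.seed(1)
  n <- 100
  # Hypothetical design matrix: an intercept column plus three predictors.
  x <- cbind(1, matrix(rnorm(n * 3), ncol = 3))
  # Simulated response with heavy-tailed (Student T, 3 df) noise.
  y <- as.numeric(x %*% c(2, 1.5, 0, 0) + rt(n, df = 3))

  prior <- StudentIndependentSpikeSlabPrior(
      predictor.matrix = x,
      response.vector = y,
      expected.model.size = 2,
      degrees.of.freedom.prior = UniformPrior(1, 50))

  # The degrees-of-freedom prior is stored on the returned object.
  print(class(prior))
  print(class(prior$degrees.of.freedom.prior))
}
```

The element name degrees.of.freedom.prior matches the argument name, as the Value section describes.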

Arguments

predictor.matrix: The design matrix for the regression problem. Missing data is not allowed.
response.vector: The vector of responses for the regression. Missing data is not allowed.
expected.r2: The expected R-squared for the regression. The spike and slab prior requires an inverse gamma prior on the residual variance of the regression. The prior can be parameterized in terms of a guess at the residual variance, and a "degrees of freedom" representing the number of observations' worth of weight the guess should carry. The guess at sigma^2 is set to (1 - expected.r2) * var(y), where y is the response vector.
prior.df: A positive scalar representing the prior 'degrees of freedom' for estimating the residual variance. This can be thought of as the amount of weight (expressed as an observation count) given to the expected.r2 argument.
expected.model.size: A positive number less than ncol(predictor.matrix), representing a guess at the number of significant predictor variables. Used to obtain the 'spike' portion of the spike and slab prior.
prior.beta.sd: A vector of positive numbers giving the prior standard deviation of each model coefficient, conditional on inclusion. If NULL it will be set to 10 * sdy / sdx.
optional.coefficient.estimate: If desired, an estimate of the regression coefficients can be supplied. In most cases this will be a difficult parameter to specify. If omitted then a prior mean of zero will be used for all coordinates except the intercept, which will be set to mean(y).
mean.y: The mean of the response vector, for use in cases when specifying the response vector is undesirable.
sdy: The standard deviation of the response vector, for use in cases when specifying the response vector is undesirable.
sdx: The standard deviations to use when scaling the prior sd of each coefficient.
prior.inclusion.probabilities: A vector giving the prior probability of inclusion for each variable.
number.of.observations: The number of observations in the data to be modeled.
number.of.variables: The number of potential predictor variables in the data to be modeled.
scale.by.residual.variance: If TRUE the prior variance is sigma_sq * V, where sigma_sq is the residual variance of the linear regression modeled by this prior. Otherwise the prior variance is V, unscaled.
sigma.upper.limit: The largest acceptable value for the residual standard deviation. A non-positive number is interpreted as Inf.
degrees.of.freedom.prior: An object of class DoubleModel representing the prior distribution for the Student T tail thickness (or "degrees of freedom") parameter.
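
The default quantities described in the arguments above can be reproduced by hand in base R. This is a sketch on simulated data, not the package's internal code. The formulas for the residual variance guess and for prior.beta.sd come from the argument descriptions; the inclusion probability line is an assumption about how expected.model.size is converted into the 'spike'.

```r
set.seed(42)
n <- 200
x <- matrix(rnorm(n * 3), ncol = 3)              # three hypothetical predictors
y <- as.numeric(1 + 2 * x[, 1] + rt(n, df = 4))  # heavy-tailed noise

expected.r2 <- 0.5
expected.model.size <- 1

sdy <- sd(y)
sdx <- apply(x, 2, sd)

# Guess at the residual variance, as described under expected.r2.
sigma.guess.sq <- (1 - expected.r2) * var(y)

# Default prior standard deviation of each coefficient when prior.beta.sd is NULL.
default.prior.beta.sd <- 10 * sdy / sdx

# ASSUMPTION: each variable gets the same prior inclusion probability,
# chosen so that the expected number of included variables equals
# expected.model.size (capped at 1).
inclusion.prob <- rep(min(1, expected.model.size / ncol(x)), ncol(x))

print(sigma.guess.sq)
print(default.prior.beta.sd)
print(inclusion.prob)
```

Supplying mean.y, sdy, sdx, number.of.observations, and number.of.variables directly serves the same purpose when passing the raw data is undesirable.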

Author

Steven L. Scott

References

Ghosh and Clyde (2011) "Rao-Blackwellization for Bayesian variable selection and model averaging in linear and binary regression: A novel data augmentation approach", Journal of the American Statistical Association, 106, 1041-1052. https://homepage.stat.uiowa.edu/~jghsh/ghosh_clyde_2011_jasa.pdf