fct), linear/polynomial terms
(lin), uni- or bivariate splines
(sm, srf), random intercepts
(rnd) or Markov random fields
(mrf) and their interactions, i.e. an
interaction between two smooth terms yields an effect
surface, an interaction between a linear term and a
random intercept yields random slopes, an interaction
between a linear term and a smooth term yields a varying
coefficient term etc.spikeSlabGAM(formula, data, ..., family = "gaussian",
hyperparameters = list(), model = list(),
mcmc = list(), start = list())ssGAMDesign).ssGAMDesign"poisson"
and "binomial" are implemented as well.spikeAndSlab.spikeAndSlab. User-supplied
groupIndicators and H entries will be
overwritten by spikeAndSlab.spikeAndSlab. Use
start=list(seed=) to set the RNG seed
for reproducible results.spikeSlabGAM with methods
summary.spikeSlabGAM,
predict.spikeSlabGAM, and
plot.spikeSlabGAM.x are treated as lin(x) +
sm(x), factors f (and numerical covariates
with very few distinct values, see
ssGAMDesign) are treated as
fct(f). Valid model formulas have to
satisfy the following conditions: y ~ x1 +
x1:x2x2 is missing. u) and penalized terms are not allowed,
i.e. y ~ u(x1)*x2 will produce an error. y ~ lin(x1) + lin(x2) + x1:x2
will produce an error.The default prior settings for Gaussian data work best for a response with unit variance. If your data is scaled very differently, either rescale the response (recommended) or adjust the hyperparameters accordingly.
Sampling of the chains is done in parallel using package
multicore if available. If not, a socket cluster
set up with snow is used where available. Use
options(cores=foo) to set the (maximal) number of
processes spawned by the parallelization. If
options()$cores is unspecified, snow uses 8.
# next statement necessary to follow CRAN check limitations on # CPU usage (tells multicore to use 2 processes only) options(cores=2)
# fit & display the model: m1 <- spikeSlabGAM(y ~ x1 * f1 + f1 * f2 + x3 * f1 + x1 * x2, data = d) summary(m1)
# visualize estimates: plot(m1) plot(m1, cumulative = FALSE) (plotTerm("fct(f1):fct(f2)", m1, aggregate = median)) print(p <- plotTerm("sm(x1):sm(x2)", m1, quantiles = c(0.25, 0.75), cumulative = FALSE))
# change MCMC settings and priors: mcmc <- list(nChains = 3, burnin = 100, chainLength = 1000, thin = 3, reduceRet = TRUE) hyper <- list(gamma = c(v0 = 5e-04), tau2 = c(10, 30), w = c(2, 1))
# complicated formula example, poisson response: m2 <- spikeSlabGAM(yp ~ x1 * (x2 + f1) + (x2 + x3 + f2)^2 - sm(x2):sm(x3), data = d, family = "poisson", mcmc = mcmc, hyperparameters = hyper) summary(m2) plot(m2)
# quick&dirty convergence diagnostics: print(b <- ssGAM2Bugs(m2)) plot(b)
spikeSlabGAM: Bayesian
Variable Selection, Model Choice and Regularization for
Generalized Additive Mixed Models in R. Journal of
Statistical Software, 43(14), 1--24.
Fabian Scheipl, Ludwig Fahrmeir, Thomas Kneib (2012).
Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models.
Journal of the American Statistical Association, to appear.ssGAMDesign for details on model
specification, spikeAndSlab for more
details on the MCMC sampler and prior specification, and
ssGAM2Bugs for MCMC diagnostics. Check out
the vignette for theoretical background and code
examples.