fct
), linear/polynomial terms
(lin
), uni- or bivariate splines
(sm
, srf
), random intercepts
(rnd
) or Markov random fields
(mrf
) and their interactions, i.e. an
interaction between two smooth terms yields an effect
surface, an interaction between a linear term and a
random intercept yields random slopes, an interaction
between a linear term and a smooth term yields a varying
coefficient term etc.spikeSlabGAM(formula, data, ..., family = "gaussian",
hyperparameters = list(), model = list(),
mcmc = list(), start = list())
ssGAMDesign
).ssGAMDesign
"poisson"
and "binomial"
are implemented as well.spikeAndSlab
.spikeAndSlab
. User-supplied
groupIndicators
and H
entries will be
overwritten by
spikeAndSlab
.spikeAndSlab
. Use
start=list(seed=)
to set the RNG seed
for reproducible results.spikeSlabGAM
with methods
summary.spikeSlabGAM
,
predict.spikeSlabGAM
, and
plot.spikeSlabGAM
.x
are treated as lin(x) +
sm(x)
, factors f
(and numerical covariates
with very few distinct values, see
ssGAMDesign
) are treated as
fct(f)
. Valid model formulas have to
satisfy the following conditions: y ~ x1 +
x1:x2
x2
is missing. u
) and penalized terms are not allowed,
i.e. y ~ u(x1)*x2
will produce an error. y ~ lin(x1) + lin(x2) + x1:x2
will produce an error.The default prior settings for Gaussian data work best for a response with unit variance. If your data is scaled very differently, either rescale the response (recommended) or adjust the hyperparameters accordingly.
Sampling of the chains is done in parallel using package
multicore
if available. If not, a socket cluster
set up with snow
is used where available. Use
options(cores=foo)
to set the (maximal) number of
processes spawned by the parallelization. If
options()$cores
is unspecified, snow uses 8.
set.seed(91179) n <- 400 d <- data.frame(f1 = sample(gl(3, n/3)), f2 = sample(gl(4, n/4)), x1 = runif(n), x2 = runif(n), x3 = runif(n)) # true model: # - interaction between f1 and x1 # - smooth interaction between x1 and x2, # - x3 and f2 are noise variables without influence on y nf1 <- as.numeric(d$f1) d$f <- with(d, 5 * (nf1 + 2 * sin(4 * pi * (x1 - 0.2) * (x2 - 0.7)) - nf1 * x1)) d$y <- with(d, scale(f + rnorm(n))) d$yp <- with(d, rpois(n, exp(f/10)))
# fit & display the model: m1 <- spikeSlabGAM(y ~ x1 * f1 + f1 * f2 + x3 * f1 + x1 * x2, data = d) summary(m1)
# visualize estimates: plot(m1) plot(m1, cumulative = FALSE) (plotTerm("fct(f1):fct(f2)", m1, aggregate = median)) print(p <- plotTerm("sm(x1):sm(x2)", m1, quantiles = c(0.25, 0.75), cumulative = FALSE))
# change MCMC settings and priors: mcmc <- list(nChains = 3, burnin = 100, chainLength = 1000, thin = 3, reduceRet = TRUE) hyper <- list(gamma = c(v0 = 5e-04), tau2 = c(10, 30), w = c(2, 1))
# complicated formula example, poisson response: m2 <- spikeSlabGAM(yp ~ x1 * (x2 + f1) + (x2 + x3 + f2)^2 - sm(x2):sm(x3), data = d, family = "poisson", mcmc = mcmc, hyperparameters = hyper) summary(m2) plot(m2)
# quick&dirty convergence diagnostics: print(b <- ssGAM2Bugs(m2)) plot(b)
spikeSlabGAM
: Bayesian
Variable Selection, Model Choice and Regularization for
Generalized Additive Mixed Models in R. Journal of
Statistical Software, 43(14), 1--24.Fabian Scheipl, Ludwig Fahrmeir, Thomas Kneib (2012). Spike-and-Slab Priors for Function Selection in Structured Additive Regression Models. Journal of the American Statistical Association, 107(500), 1518--1532.
ssGAMDesign
for details on model
specification, spikeAndSlab
for more
details on the MCMC sampler and prior specification, and
ssGAM2Bugs
for MCMC diagnostics. Check out
the vignette for theoretical background and code
examples.