The method provides Bayesian variable selection for binomial logit models
using mixture priors with a spike and a slab component to identify regressors
with a non-zero effect. More specifically, a Dirac spike (i.e. a point mass
at zero) is used and, by default, the slab component is specified as a scale
mixture of normal distributions, resulting in a Student-t distribution with
2 * psi.nu degrees of freedom.
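The scale-mixture claim can be checked numerically. The following is a minimal sketch, assuming the mixing variable is inverse gamma with shape and rate both equal to psi.nu and that the conditional slab variance is V times that variable; this parameterization is an assumption made for illustration and is not stated on this page. The variable names psi, beta, and probs are likewise illustrative.

```r
set.seed(42)

psi.nu <- 5      # slab hyper-parameter (documented default)
V      <- 5      # slab variance parameter (documented default)
n      <- 100000 # number of Monte Carlo draws

# Assumption: the mixing variable is inverse gamma with shape = rate = psi.nu.
psi <- 1 / rgamma(n, shape = psi.nu, rate = psi.nu)

# Conditional on psi, the slab is normal with variance V * psi.
beta <- rnorm(n, mean = 0, sd = sqrt(V * psi))

# Compare empirical quantiles with those of a scaled Student-t
# distribution with 2 * psi.nu degrees of freedom.
probs       <- c(0.5, 0.75, 0.9)
empirical   <- quantile(beta, probs, names = FALSE)
theoretical <- sqrt(V) * qt(probs, df = 2 * psi.nu)

print(round(cbind(probs, empirical, theoretical), 3))
```

Under these assumptions the two quantile columns should agree up to Monte Carlo error, which illustrates why the marginal slab is heavier-tailed than a normal slab with the same scale.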
In the more general random intercept model, variance selection of the random
intercept is based on the non-centered parameterization of the model, where
the signed standard deviation \(\theta_\alpha\) of the random intercept term
appears as a further regression effect in the model equation.
For details, see Wagner and Duller (2012).
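As a sketch of what the non-centered parameterization looks like, the linear predictor can be written in two equivalent ways. The symbols \(p_i\) (success probability), \(\mu\) (intercept), \(x_i\), \(\beta\), \(c(i)\) (cluster of observation \(i\)) and \(\tilde{\alpha}_c\) are notation assumed here for illustration; only \(\theta_\alpha\) is taken from the text above.

```latex
% Centered form: the random intercept carries its own variance.
\[
  \operatorname{logit}(p_i) = \mu + x_i^\top \beta + \alpha_{c(i)},
  \qquad \alpha_c \sim \mathcal{N}\!\left(0, \theta_\alpha^2\right).
\]
% Non-centered form: substitute \alpha_c = \theta_\alpha \tilde{\alpha}_c.
\[
  \operatorname{logit}(p_i) = \mu + x_i^\top \beta
    + \theta_\alpha \, \tilde{\alpha}_{c(i)},
  \qquad \tilde{\alpha}_c \sim \mathcal{N}(0, 1).
\]
```

In the second form \(\theta_\alpha\) multiplies a standardized effect, so it enters the model equation like a regression coefficient, and a spike-and-slab prior on it amounts to selecting whether the random intercept has non-zero variance.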
The implementation of Bayesian variable selection further relies on the
representation of the binomial logit model as a Gaussian regression model
in auxiliary variables. Data augmentation is based on Fussl et
al. (2013), who show that the binomial logit model can be represented as a
linear regression model in the latent variable, which has an interpretation as
the difference of aggregated utilities. The error distribution in the auxiliary
model is approximated by a finite scale mixture of normal distributions, where
the mixture parameters are taken from the R package binomlogit.
See Fussl (2014) for details.
For details concerning the sampling algorithm see Dvorzak and Wagner (2016)
and Wagner and Duller (2012).
Details for the model specification (see arguments):
model
deltafix
an indicator vector of length ncol(X)-1
specifying which regression effects are subject to selection (i.e., 0 =
subject to selection, 1 = fixed in the model); defaults to a vector of zeros.
gammafix
an indicator for variance selection of the random
intercept term (i.e., 0 = with variance selection (default), 1 = no
variance selection); only used if a random intercept is included in the
model (see ri).
ri
logical. If TRUE, a cluster-specific random intercept is included in the
model; defaults to FALSE.
clusterID
a numeric vector of length equal to the number
of observations containing the cluster ID c = 1,...,C for each observation
(required if ri=TRUE).
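A model list following the descriptions above might be assembled as in this sketch. The design matrix X, its dimensions, and the cluster layout are made-up illustration data, and X is assumed to carry the intercept in its first column, which is why deltafix has ncol(X)-1 entries.

```r
set.seed(1)

n_obs <- 12
# Hypothetical design matrix: intercept column plus three covariates.
X <- cbind(1, matrix(rnorm(n_obs * 3), nrow = n_obs, ncol = 3))

# Hypothetical cluster layout: four clusters with three observations each.
clusterID <- rep(1:4, each = 3)

model <- list(
  deltafix  = c(1, 0, 0),  # keep the first covariate fixed, select the others
  gammafix  = 0,           # random intercept variance is subject to selection
  ri        = TRUE,        # include a cluster-specific random intercept
  clusterID = clusterID    # required because ri = TRUE
)

# Sanity checks derived from the argument descriptions.
stopifnot(length(model$deltafix) == ncol(X) - 1)
stopifnot(length(model$clusterID) == nrow(X))
```

The two checks mirror the length requirements stated for deltafix and clusterID, and catching a mismatch before sampling is cheaper than diagnosing it afterwards.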
prior
slab
distribution of the slab component, i.e. "Student" (default) or "Normal".
psi.nu
hyper-parameter of the Student-t slab (used for a "Student" slab);
defaults to 5.
m0
prior mean for the intercept parameter; defaults to 0.
M0
prior variance for the intercept parameter; defaults to 100.
aj0
a vector of prior means for the regression effects (encoded in a
normal distribution, see notes); defaults to a vector of zeros.
V
variance of the slab; defaults to 5.
w
hyper-parameters of the Beta-prior for the mixture weight
\(\omega\); defaults to c(wa0=1, wb0=1), i.e. a uniform distribution.
pi
hyper-parameters of the Beta-prior for the mixture weight
\(\pi\); defaults to c(pa0=1, pb0=1), i.e. a uniform distribution.
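The prior settings listed above can be collected in a list as sketched below, using the documented defaults. The number of regression effects (three) is a made-up value for illustration. The last lines confirm that a Beta(1, 1) prior is the uniform distribution on the unit interval.

```r
n_effects <- 3  # hypothetical number of regression effects

prior <- list(
  slab   = "Student",              # or "Normal"
  psi.nu = 5,                      # only used for the Student-t slab
  m0     = 0,                      # prior mean of the intercept
  M0     = 100,                    # prior variance of the intercept
  aj0    = rep(0, n_effects),      # prior means of the regression effects
  V      = 5,                      # variance of the slab
  w      = c(wa0 = 1, wb0 = 1),    # Beta prior for the mixture weight omega
  pi     = c(pa0 = 1, pb0 = 1)     # Beta prior for the mixture weight pi
)

# A Beta(1, 1) density is constant and equal to 1 on (0, 1), i.e. uniform.
grid    <- c(0.1, 0.5, 0.9)
density <- dbeta(grid, shape1 = prior$w[["wa0"]], shape2 = prior$w[["wb0"]])
print(density)
```

Raising both Beta parameters above 1 would concentrate prior mass around an inclusion probability of one half, whereas the default treats all inclusion probabilities as equally plausible.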
mcmc
M
number of MCMC iterations after the burn-in phase; defaults
to 8000.
burnin
number of MCMC iterations discarded as burn-in;
defaults to 2000.
thin
thinning parameter; defaults to 1.
startsel
number of MCMC iterations drawn from the unrestricted
model (e.g., burnin/2); defaults to 1000.
verbose
MCMC progress report in each verbose-th iteration step; defaults to 500.
If verbose=0, no output is generated.
msave
returns additional output with variable
selection details (i.e. posterior samples for \(\omega\),
\(\delta\), \(\pi\), \(\gamma\)); defaults to FALSE.
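The sampler settings can be gathered in the same way. This sketch uses the documented defaults and derives the total number of iterations as burnin plus M, since M counts only the iterations after the burn-in phase. The check that startsel does not exceed burnin is an assumption suggested by the burnin/2 example, not a documented requirement, and total_iter is an illustrative name.

```r
mcmc <- list(
  M        = 8000,   # iterations after the burn-in phase
  burnin   = 2000,   # iterations discarded as burn-in
  thin     = 1,      # thinning parameter
  startsel = 1000,   # iterations drawn from the unrestricted model
  verbose  = 500,    # progress report every 500 iterations (0 = silent)
  msave    = FALSE   # do not return the additional selection details
)

# Total number of sampler iterations implied by these settings.
total_iter <- mcmc$burnin + mcmc$M
print(total_iter)

# Assumed sanity check: selection starts within the burn-in phase.
stopifnot(mcmc$startsel <= mcmc$burnin)
```

Starting selection only after some unrestricted iterations gives the sampler time to move towards plausible coefficient values before indicators can switch effects off.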