Bayesian inference for multivariate GLMs with group-specific coefficients
that are assumed to be correlated across the GLM submodels.
stan_mvmer(
formula,
data,
family = gaussian,
weights,
prior = normal(autoscale = TRUE),
prior_intercept = normal(autoscale = TRUE),
prior_aux = cauchy(0, 5, autoscale = TRUE),
prior_covariance = lkj(autoscale = TRUE),
prior_PD = FALSE,
algorithm = c("sampling", "meanfield", "fullrank"),
adapt_delta = NULL,
max_treedepth = 10L,
init = "random",
QR = FALSE,
sparse = FALSE,
...
)
A stanmvreg object is returned.
A two-sided linear formula object describing both the
fixed-effects and random-effects parts of the longitudinal submodel
similar in vein to formula specification in the lme4 package
(see glmer
or the lme4 vignette for details).
Note however that the double bar (||
) notation is not allowed
when specifying the random-effects parts of the formula, and neither
are nested grouping factors (e.g. (1 | g1/g2))
or
(1 | g1:g2)
, where g1
, g2
are grouping factors.
For a multivariate GLM this should be a list of such formula objects,
with each element of the list providing the formula for one of the
GLM submodels.
A data frame containing the variables specified in
formula
. For a multivariate GLM, this can
be either a single data frame which contains the data for all
GLM submodels, or it can be a list of data frames where each
element of the list provides the data for one of the GLM submodels.
The family (and possibly also the link function) for the
GLM submodel(s). See glmer
for details.
If fitting a multivariate GLM, then this can optionally be a
list of families, in which case each element of the list specifies the
family for one of the GLM submodels. In other words, a different family
can be specified for each GLM submodel.
Same as in glm
,
except that when fitting a multivariate GLM and a list of data frames
is provided in data
then a corresponding list of weights
must be provided. If weights are
provided for one of the GLM submodels, then they must be provided for
all GLM submodels.
Same as in stan_glmer
except that for a multivariate GLM a list of priors can be provided for
any of prior
, prior_intercept
or prior_aux
arguments.
That is, different priors can optionally be specified for each of the GLM
submodels. If a list is not provided, then the same prior distributions are
used for each GLM submodel. Note that the "product_normal"
prior is
not allowed for stan_mvmer
.
Cannot be NULL
; see priors
for
more information about the prior distributions on covariance matrices.
Note however that the default prior for covariance matrices in
stan_mvmer
is slightly different to that in stan_glmer
(the details of which are described on the priors
page).
A logical scalar (defaulting to FALSE
) indicating
whether to draw from the prior predictive distribution instead of
conditioning on the outcome.
A string (possibly abbreviated) indicating the
estimation approach to use. Can be "sampling"
for MCMC (the
default), "optimizing"
for optimization, "meanfield"
for
variational inference with independent normal distributions, or
"fullrank"
for variational inference with a multivariate normal
distribution. See rstanarm-package
for more details on the
estimation algorithms. NOTE: not all fitting functions support all four
algorithms.
Only relevant if algorithm="sampling"
. See
the adapt_delta help page for details.
A positive integer specifying the maximum treedepth
for the non-U-turn sampler. See the control
argument in
stan
.
The method for generating initial values. See
stan
.
A logical scalar defaulting to FALSE
, but if TRUE
applies a scaled qr
decomposition to the design matrix. The
transformation does not change the likelihood of the data but is
recommended for computational reasons when there are multiple predictors.
See the QR-argument documentation page for details on how
rstanarm does the transformation and important information about how
to interpret the prior distributions of the model parameters when using
QR=TRUE
.
A logical scalar (defaulting to FALSE
) indicating
whether to use a sparse representation of the design (X) matrix.
If TRUE
, the the design matrix is not centered (since that would
destroy the sparsity) and likewise it is not possible to specify both
QR = TRUE
and sparse = TRUE
. Depending on how many zeros
there are in the design matrix, setting sparse = TRUE
may make
the code run faster and can consume much less RAM.
Further arguments passed to the function in the rstan
package (sampling
,
vb
, or
optimizing
),
corresponding to the estimation method named by algorithm
. For example,
if algorithm
is "sampling"
it is possible to specify iter
,
chains
, cores
, and other MCMC controls.
Another useful argument that can be passed to rstan via ...
is
refresh
, which specifies how often to print updates when sampling
(i.e., show the progress every refresh
iterations). refresh=0
turns off the iteration updates.
The stan_mvmer
function can be used to fit a multivariate
generalized linear model (GLM) with group-specific terms. The model consists
of distinct GLM submodels, each which contains group-specific terms; within
a grouping factor (for example, patient ID) the grouping-specific terms are
assumed to be correlated across the different GLM submodels. It is
possible to specify a different outcome type (for example a different
family and/or link function) for each of the GLM submodels.
Bayesian estimation of the model is performed via MCMC, in the same way as
for stan_glmer
. Also, similar to stan_glmer
,
an unstructured covariance matrix is used for the group-specific terms
within a given grouping factor, with priors on the terms of a decomposition
of the covariance matrix.See priors
for more information about
the priors distributions that are available for the covariance matrices,
the regression coefficients and the intercept and auxiliary parameters.
stan_glmer
, stan_jm
,
stanreg-objects
, stanmvreg-methods
,
print.stanmvreg
, summary.stanmvreg
,
posterior_predict
, posterior_interval
.
if (.Platform$OS.type != "windows" || .Platform$r_arch !="i386") {
# \donttest{
#####
# A multivariate GLM with two submodels. For the grouping factor 'id', the
# group-specific intercept from the first submodel (logBili) is assumed to
# be correlated with the group-specific intercept and linear slope in the
# second submodel (albumin)
f1 <- stan_mvmer(
formula = list(
logBili ~ year + (1 | id),
albumin ~ sex + year + (year | id)),
data = pbcLong,
# this next line is only to keep the example small in size!
chains = 1, cores = 1, seed = 12345, iter = 1000)
summary(f1)
#####
# A multivariate GLM with one bernoulli outcome and one
# gaussian outcome. We will artificially create the bernoulli
# outcome by dichotomising log serum bilirubin
pbcLong$ybern <- as.integer(pbcLong$logBili >= mean(pbcLong$logBili))
f2 <- stan_mvmer(
formula = list(
ybern ~ year + (1 | id),
albumin ~ sex + year + (year | id)),
data = pbcLong,
family = list(binomial, gaussian),
chains = 1, cores = 1, seed = 12345, iter = 1000)
# }
}
Run the code above in your browser using DataLab