lmer
.blmer(formula, data, family = NULL, REML = TRUE,
control = list(), start = NULL, verbose = FALSE,
doFit = TRUE, subset, weights, na.action, offset,
contrasts = NULL, model = TRUE, x = TRUE,
cov.prior = "wishart", fixef.prior = "normal",
var.prior = "inverse.gamma",
...)
bglmer(formula, data, family = gaussian, start = NULL,
verbose = FALSE, nAGQ = 1, doFit = TRUE, subset, weights,
na.action, offset, contrasts = NULL, model = TRUE,
control = list(), cov.prior = "wishart",
fixef.prior = "normal", ...)
"none"
, "spectral"
, "correlation"
,
and the direct application of a named distrib"none"
and "normal"
.
Default is "normal"
with parameters as specified in the
"point"
and
"inverse.gamma"
. Default is "inver
TRUE
verbose output is
generated during the optimization of the parameter estimates. In
addition, debugging information about priors is printed out before
optimization.lmer
; see there for details.lm
; see there for
details."bmer "
, for which many methods
are available. See there for details.blmer
and bglmer
closely
follows the functions lmer
and
glmer
. Those help pages provide a good overview of
fitting linear and generalized linear mixed models. The primary
distinction is that blmer
and bglmer
allow the user to
do Bayesian inference, with priors imposed on the different
model components.
Covariance Prior
The cov.prior
argument is applies a prior over the
covariance matrix of the modeled coefficients, or "random effects".
As there is one covariance matrix for every named grouping factor,
that is every element that appears to the right of a vertical bar
("|") in the model formula, it is possible to apply as many
different priors as there are said factors.
The general format of an argument to blmer
or bglmer
for such a prior is of the form:
cov.prior = "factor.name ~ prior.type(type.options)"
If the "factor.name ~
" construct is ommitted, the prior
is interpretted as a default and applied to all factors that
lack specific priors of their own. Options are not required,
but permit fine-tuning of the model.
prior.type
s can be none
, correlation
,
spectral
, and a direct prior of a named distribution.
Respectively, these signify:
none
- no/flat prior, useful to override a defaultcorrelation
- decomposes each covariance matrix
into a diagonal component and a center component, roughly
corresponding to the standard deviations and correlations
of the original matrix. If$D$is diagonal,$R$is
a correlation matrix, then the decomposition is$Sigma = DRD$. As estimation in space is highly constrained,$R$is typically taken to be any positive definite matrix.spectral
- applies priors to a spectral/eigen decomposition
of each covariance matrix, typically with a flat prior over
the orthonormal component and idendependent and identical priors
over the eigenvalues.gamma
,inverse.gamma
,wishart
, orinverse.wishart
family of distributions,
to be applied directly to the covariance matrix, or possibly the
standard deviation in the univiariate case. Type-specific options include the further detailing of the distributions
to be used in a decomposition, as well as the data.scale
setting.
data.scale
can be either 'absolute'
or 'free'
. When
set to absolute, scale (and inverse scale/rate) parameters are interpretted
as they are given. Using data.scale = 'free'
adjusts the scales
of each prior by sd(y) / sd(x)
(or the same with variances, if applicable),
so that they correspond to an absolute prior specified on standardized data.
Exceptions to this rule are that sd(y)
is replaced by $1$ when the
model is not over continuous outcomes, and the same change follows for input
variables that have no variance, like intercepts. The quantity
sd(y) / sd(x)
, with suitable substitions, will be referred to as the
sd.ratio
hereafter.
Further options for correlation
and spectral
types are the
distributional assumptions that are necessary for them to be fully specified.
By default the correlation
type applies independent gamma
priors to each
component of $D$, and a wishart
prior to $R$. The default
for spectral
is to apply independent gamma
priors over the
eigenvalues.
To change these settings, these prior types take options of the form:
coordinate.name ~ univariate.distribution(distribution.options)
In the case of the correlation type, also possible is:
multivariate.distribution(distribution.options)
Coordinate names (or numbers) refer to the components of the covariance
matrix. For example, a model containing (1 + x.1 | g.1)
has
coordinates (Intercept)
and x.1
at grouping factor g.1
.
Eigenvalues do not directly correspond to coordinates as named, so
they must be specified by the number of their position along the diagonal.
Options for different distributions are specified in a named-list style,
and are comprised of:
gamma
-shape
,rate
, andposterior.scale
.
Shape and rate are positive real numbers, while the posterior scale can be'sd'
or'var'
, specifying on which form of the parameter the
prior is to be applied.inverse.gamma
-shape
,scale
, andposterior.scale
. Arguments are similar to thegamma
case.wishart
-df
,scale
. The scale argument must
be a positive definite matrix of the appropriate dimension, a positive
scalar, or vector of positive scalars which is turned into a diagonal
matrix.inverse.wishart
-df
,inverse.scale
. Similar
to thewishart
case.correlation
type, family defaults are:
gamma
-shape = 2
,posterior.scale = 'sd'
.rate
defaults a value that puts the mode at$10^-2$or$10^-1$, forposterior.scale = 'sd'
and'var'
respectively. If the mode does not exist, the mean is used instead.
Unlike the standard behavior, fordata.scale = 'free'
,
the square root ofsd.ratio
orsd.ratio
itself further
divides the rate. The square roots of
a standard deviation is used as for an unconstrained positive definite
matrix for$R$, the full covariance matrix,$DRD$, will have a
diagonal consisting of the squares of two numbers, so that "scales"
will appear four times.inverse.gamma
-shape = 0.5
,posterior.scale = 'sd'
, and thescale
is chosen to set
the mode equal$10^-1$or$10^-2$, depending on the
value ofposterior.scale
as above.wishart
- for a grouping factor of dimension$K$,df = K + 3
. When the mode of the distribution exists, thescale
is chosen to set the mode equal$10^-2$times the
identity matrix. When, the mode does not exist, the mean is
set to that value.inverse.wishart
- for a grouping factor of dimension$K$,df = K - 0.5
.inverse.scale
is chosen so that the mode is
equal to$10^-2$times the identity matrix.spectral
type:
gamma
-shape = 2
,posterior.scale = 'var'
.rate
defaults to setting the mode to$10^-4$or$10^-2$,
depending on whetherposterior.scale = 'var'
or'sd'
,
respectively. If the mode does not exist, the mean is set to that value
instead.inverse.gamma
-shape = 0.5
,posterior.scale = 'var'
, and thescale
is chosen to set
the mode equal to$10^-4$or$10^-2$,
depending on the value ofposterior.scale
.gamma
-shape = 2
,posterior.scale = 'sd'
,
and therate
is chosen so that the mode, when it exists, is equal to$10^-2$($10^-4$forposterior.scale = 'var'
). When
the mode does not exist, the mean is fixed to that value instead.inverse.gamma
-shape = 0.5
,posterior.scale = 'sd'
,
and thescale
is chosen so that the mode is equal to$10^-2$or$10^-4$, forposterior.scale = 'sd'
and'var'
, respectively.wishart
- for a grouping factor of dimension$K$,df = K + 3
.scale
is chosen so that the mode of the distribution is equal to$10^-4$times the identity matrix, whenever it is that the mode exists.
When it does not, the mean is set to that value instead.inverse.wishart
- for a grouping factor of dimension$K$,df = K - 0.5
.inverse.scale
is chosen to set the mode of the
distribution to$10^-4$times the identity matrix.normal
.
It takes as options:
sd
- a single scalar, a vector of length 2, or a vector of
length equal to the number of unmodeled coefficients. Provides the
standard deviations for an assumed to be independent set of Gaussian priors.
All elements must be positive. When a single value is applied, it is
repeated for each unmodeled coefficient. When two values are given, the
first is used only for the intercept term, while the other is repeated.cov
- same as above, but also permits specifying full matrices
of the appropriate dimension. Note thatsd
is on the scale of
standard deviations, whilecov
is that of a variance/covariance.
Must be positive definite, or, for vector form, consistingly entirely of
positive numbers.cov.scale
- can be'absolute'
or'common'
.
Optimization inlmer
is done with all covariances
multiplied by a "common scale factor". When the'common'
type
is used, it is assumed that the specified covariance should also be
multiplied by this factor.data.scale
- either of'absolute'
or'free'
.
The absolute setting interprets standard deviations and covariances
as given, while a prior that is scale-free has a covariance matrix that
is scaled bysd.ratio
squared.sd
to be c(10, 2.5)
, cov.scale
to be
'absolute'
, and data.scale
to be 'free'
.Common Scale Prior
lmer
, glmer
,
mer
class, and lm
.# covariance prior
(fm1 <- blmer(Reaction ~ Days + (Days|Subject), sleepstudy,
cov.prior="gamma"))
(fm2 <- blmer(Reaction ~ Days + (Days|Subject), sleepstudy,
cov.prior="gamma(shape = 2, rate = 0.5, posterior.scale='sd')"))
(fm3 <- blmer(Reaction ~ Days + (1 + Days|Subject), sleepstudy,
cov.prior="wishart"))
(fm4 <- blmer(Reaction ~ Days + (1 + Days|Subject), sleepstudy,
cov.prior="inverse.wishart(df = 5, inverse.scale = diag(0.5, 2))"))
# unmodeled coefficient prior
(fm5 <- blmer(Reaction ~ Days + (1 + Days|Subject), sleepstudy,
cov.prior = "none", fixef.prior="normal"))
(fm6 <- blmer(Reaction ~ Days + (1 + Days|Subject), sleepstudy,
cov.prior = "none",
fixef.prior="normal(cov = diag(0.5, 2), posterior.scale='absolute')"))
# common scale prior
# eight schools example
y <- c(28, 8, -3, 7, -1, 1, 18, 12);
sigma <- c(15, 10, 16, 11, 9, 11, 10, 18);
y.z <- (y - mean(y)) / sigma;
g <- 1:8
model1 <- blmer(y.z ~ 1 + (1 | g), var.prior = "point",
cov.prior = NULL, fixef.prior = NULL);
Run the code above in your browser using DataLab