# set_prior

##### Prior Definitions for brms Models

Define priors for specific parameters or classes of parameters

##### Usage

```
set_prior(prior, class = "b", coef = "", group = "", resp = "",
dpar = "", nlpar = "", lb = NA, ub = NA, check = TRUE)
```prior(prior, ...)

prior_(prior, ...)

prior_string(prior, ...)

##### Arguments

- prior
A character string defining a distribution in Stan language

- class
The parameter class. Defaults to

`"b"`

(i.e. population-level effects). See 'Details' for other valid parameter classes.- coef
Name of the (population- or group-level) parameter.

- group
Grouping factor of group-level parameters.

- resp
Name of the response variable / category. Only used in multivariate models.

- dpar
Name of a distributional parameter. Only used in distributional models.

- nlpar
Name of a non-linear parameter. Only used in non-linear models.

- lb
Lower bound for parameter restriction. Currently only allowed for classes

`"b"`

,`"ar"`

,`"ma"`

, and`"arr"`

. Defaults to`NULL`

, that is no restriction.- ub
Upper bound for parameter restriction. Currently only allowed for classes

`"b"`

,`"ar"`

,`"ma"`

, and`"arr"`

. Defaults to`NULL`

, that is no restriction.- check
Logical; Indicates whether priors should be checked for validity (as far as possible). Defaults to

`TRUE`

. If`FALSE`

,`prior`

is passed to the Stan code as is, and all other arguments are ignored.- ...
Arguments passed to

`set_prior`

.

##### Details

`set_prior`

is used to define prior distributions for parameters
in brms models. The functions `prior`

, `prior_`

, and
`prior_string`

are aliases of `set_prior`

each allowing
for a different kind of argument specification.
`prior`

allows specifying arguments as expression without
quotation marks using non-standard evaluation.
`prior_`

allows specifying arguments as one-sided formulas
or wrapped in `quote`

.
`prior_string`

allows specifying arguments as strings just
as `set_prior`

itself.

Below, we explain its usage and list some common prior distributions for parameters. A complete overview on possible prior distributions is given in the Stan Reference Manual available at http://mc-stan.org/.

To combine multiple priors, use `c(...)`

or the `+`

operator
(see 'Examples'). brms does not check if the priors are written
in correct Stan language. Instead, Stan will check their
syntactical correctness when the model is parsed to `C++`

and
returns an error if they are not.
This, however, does not imply that priors are always meaningful if they are
accepted by Stan. Although brms trys to find common problems
(e.g., setting bounded priors on unbounded parameters), there is no guarantee
that the defined priors are reasonable for the model.
Below, we list the types of parameters in brms models,
for which the user can specify prior distributions.

1. Population-level ('fixed') effects

Every Population-level effect has its own regression parameter
represents the name of the corresponding population-level effect.
Suppose, for instance, that `y`

is predicted by `x1`

and `x2`

(i.e., `y ~ x1 + x2`

in formula syntax).
Then, `x1`

and `x2`

have regression parameters
`b_x1`

and `b_x2`

respectively.
The default prior for population-level effects (including monotonic and
category specific effects) is an improper flat prior over the reals.
Other common options are normal priors or student-t priors.
If we want to have a normal prior with mean 0 and
standard deviation 5 for `x1`

, and a unit student-t prior with 10
degrees of freedom for `x2`

, we can specify this via
`set_prior("normal(0,5)", class = "b", coef = "x1")`

and
`set_prior("student_t(10,0,1)", class = "b", coef = "x2")`

.
To put the same prior on all population-level effects at once,
we may write as a shortcut `set_prior("<prior>", class = "b")`

.
This also leads to faster sampling, because priors can be vectorized in this case.
Both ways of defining priors can be combined using for instance
`set_prior("normal(0,2)", class = "b")`

and
`set_prior("normal(0,10)", class = "b", coef = "x1")`

at the same time. This will set a `normal(0,10)`

prior on
the effect of `x1`

and a `normal(0,2)`

prior
on all other population-level effects.
However, this will break vectorization and
may slow down the sampling procedure a bit.

In case of the default intercept parameterization
(discussed in the 'Details' section of `brmsformula`

),
general priors on class `"b"`

will *not* affect
the intercept. Instead, the intercept has its own parameter class
named `"Intercept"`

and priors can thus be
specified via `set_prior("<prior>", class = "Intercept")`

.
Setting a prior on the intercept will not break vectorization
of the other population-level effects.
Note that technically, this prior is set on an intercept that
results when internally centering all population-level predictors
around zero to improve sampling efficiency. On this centered
intercept, specifying a prior is actually much easier and
intuitive than on the original intercept, since the former
represents the expected response value when all predictors
are at their means. To treat the intercept as an ordinary
population-level effect and avoid the centering parameterization,
use `0 + intercept`

on the right-hand side of the model formula.

A special shrinkage prior to be applied on population-level effects
is the horseshoe prior. See `horseshoe`

for details. Another shrinkage prior is the so-called lasso prior.
See `lasso`

for details.

In non-linear models, population-level effects are defined separately
for each non-linear parameter. Accordingly, it is necessary to specify
the non-linear parameter in `set_prior`

so that priors
we can be assigned correctly.
If, for instance, `alpha`

is the parameter and `x`

the predictor
for which we want to define the prior, we can write
`set_prior("<prior>", coef = "x", nlpar = "alpha")`

.
As a shortcut we can use `set_prior("<prior>", nlpar = "alpha")`

to set the same prior on all population-level effects of `alpha`

at once.

If desired, population-level effects can be restricted to fall only
within a certain interval using the `lb`

and `ub`

arguments
of `set_prior`

. This is often required when defining priors
that are not defined everywhere on the real line, such as uniform
or gamma priors. When defining a `uniform(2,4)`

prior,
you should write `set_prior("uniform(2,4)", lb = 2, ub = 4)`

.
When using a prior that is defined on the positive reals only
(such as a gamma prior) set `lb = 0`

.
In most situations, it is not useful to restrict population-level
parameters through bounded priors
(non-linear models are an important exception),
but if you really want to this is the way to go.

2. Standard deviations of group-level ('random') effects

Each group-level effect of each grouping factor has a standard deviation named
`sd_<group>_<coef>`

. Consider, for instance, the formula
`y ~ x1 + x2 + (1 + x1 | g)`

.
We see that the intercept as well as `x1`

are group-level effects
nested in the grouping factor `g`

.
The corresponding standard deviation parameters are named as
`sd_g_Intercept`

and `sd_g_x1`

respectively.
These parameters are restricted to be non-negative and, by default,
have a half student-t prior with 3 degrees of freedom and a
scale parameter that depends on the standard deviation of the response
after applying the link function. Minimally, the scale parameter is 10.
This prior is used (a) to be only very weakly informative in order to influence
results as few as possible, while (b) providing at least some regularization
to considerably improve convergence and sampling efficiency.
To define a prior distribution only for standard deviations
of a specific grouping factor,
use `set_prior("<prior>", class = "sd", group = "<group>")`

.
To define a prior distribution only for a specific standard deviation
of a specific grouping factor, you may write
`set_prior("<prior>", class = "sd", group = "<group>", coef = "<coef>")`

.
Recommendations on useful prior distributions for
standard deviations are given in Gelman (2006), but note that he
is no longer recommending uniform priors, anymore.

When defining priors on group-level parameters in non-linear models,
please make sure to specify the corresponding non-linear parameter
through the `nlpar`

argument in the same way as
for population-level effects.

3. Correlations of group-level ('random') effects

If there is more than one group-level effect per grouping factor,
the correlations between those effects have to be estimated.
The prior `"lkj_corr_cholesky(eta)"`

or in short
`"lkj(eta)"`

with `eta > 0`

is essentially the only prior for (Cholesky factors) of correlation matrices.
If `eta = 1`

(the default) all correlations matrices
are equally likely a priori. If `eta > 1`

, extreme correlations
become less likely, whereas `0 < eta < 1`

results in
higher probabilities for extreme correlations.
Correlation matrix parameters in `brms`

models are named as
`cor_<group>`

, (e.g., `cor_g`

if `g`

is the grouping factor).
To set the same prior on every correlation matrix,
use for instance `set_prior("lkj(2)", class = "cor")`

.
Internally, the priors are transformed to be put on the Cholesky factors
of the correlation matrices to improve efficiency and numerical stability.
The corresponding parameter class of the Cholesky factors is `L`

,
but it is not recommended to specify priors for this parameter class directly.

4. Splines

Splines are implemented in brms using the 'random effects'
formulation as explained in `gamm`

).
Thus, each spline has its corresponding standard deviations
modeling the variability within this term. In brms, this
parameter class is called `sds`

and priors can
be specified via ```
set_prior("<prior>", class = "sds",
coef = "<term label>")
```

. The default prior is the same as
for standard deviations of group-level effects.

5. Gaussian processes

Gaussian processes as currently implemented in brms have
two parameters, the standard deviation parameter `sdgp`

,
and characteristic length-scale parameter `lscale`

(see `gp`

for more details). The default prior
of `sdgp`

is the same as for standard deviations of
group-level effects. The default prior of `lscale`

is an informative inverse-gamma prior specifically tuned
to the covariates of the Gaussian process (for more details see
https://betanalpha.github.io/assets/case_studies/gp_part3/part3.html).
This tuned prior may be overly informative in some cases, so please
consider other priors as well to make sure inference is
robust to the prior specification. If tuning fails, a half-normal prior
is used instead.

6. Autocorrelation parameters

The autocorrelation parameters currently implemented are named
`ar`

(autoregression), `ma`

(moving average),
`arr`

(autoregression of the response), `car`

(spatial conditional autoregression), as well as `lagsar`

and `errorsar`

(Spatial simultaneous autoregression).

Priors can be defined by `set_prior("<prior>", class = "ar")`

for `ar`

and similar for other autocorrelation parameters.
By default, `ar`

and `ma`

are bounded between `-1`

and `1`

, `car`

, `lagsar`

, and `errorsar`

are
bounded between `0`

, and `1`

, and `arr`

is unbounded
(you may change this by using the arguments `lb`

and `ub`

).
The default prior is flat over the definition area.

7. Distance parameters of monotonic effects

As explained in the details section of `brm`

,
monotonic effects make use of a special parameter vector to
estimate the 'normalized distances' between consecutive predictor
categories. This is realized in Stan using the `simplex`

parameter type. This class is named `"simo"`

(short for
simplex monotonic) in brms.
The only valid prior for simplex parameters is the
dirichlet prior, which accepts a vector of length `K - 1`

(K = number of predictor categories) as input defining the
'concentration' of the distribution. Explaining the dirichlet prior
is beyond the scope of this documentation, but we want to describe
how to define this prior syntactically correct.
If a predictor `x`

with `K`

categories is modeled as monotonic,
we can define a prior on its corresponding simplex via
`prior(dirichlet(<vector>), class = simo, coef = mox1)`

.
The `1`

in the end of `coef`

indicates that this is the first
simplex in this term. If interactions between multiple monotonic
variables are modeled, multiple simplexes per term are required.
For `<vector>`

, we can put in any `R`

expression
defining a vector of length `K - 1`

. The default is a uniform
prior (i.e. `<vector> = rep(1, K-1)`

) over all simplexes
of the respective dimension.

8. Parameters for specific families

Some families need additional parameters to be estimated.
Families `gaussian`

, `student`

, `skew_normal`

,
`lognormal`

, and `gen_extreme_value`

need the parameter
`sigma`

to account for the residual standard deviation.
By default, `sigma`

has a half student-t prior that scales
in the same way as the group-level standard deviations.
Further, family `student`

needs the parameter
`nu`

representing the degrees of freedom of students-t distribution.
By default, `nu`

has prior `"gamma(2, 0.1)"`

and a fixed lower bound of `1`

.
Families `gamma`

, `weibull`

, `inverse.gaussian`

, and
`negbinomial`

need a `shape`

parameter that has a
`"gamma(0.01, 0.01)"`

prior by default.
For families `cumulative`

, `cratio`

, `sratio`

,
and `acat`

, and only if `threshold = "equidistant"`

,
the parameter `delta`

is used to model the distance between
two adjacent thresholds.
By default, `delta`

has an improper flat prior over the reals.
The `von_mises`

family needs the parameter `kappa`

, representing
the concentration parameter. By default, `kappa`

has prior
`"gamma(2, 0.01)"`

.
Every family specific parameter has its own prior class, so that
`set_prior("<prior>", class = "<parameter>")`

is the right way to go.
All of these priors are chosen to be weakly informative,
having only minimal influence on the estimations,
while improving convergence and sampling efficiency.

Often, it may not be immediately clear,
which parameters are present in the model.
To get a full list of parameters and parameter classes for which
priors can be specified (depending on the model)
use function `get_prior`

.

##### Value

An object of class `brmsprior`

to be used in the `prior`

argument of `brm`

.

##### Functions

`prior`

: Alias of`set_prior`

allowing to specify arguments as expressions without quotation marks.`prior_`

: Alias of`set_prior`

allowing to specify arguments as as one-sided formulas or wrapped in`quote`

.`prior_string`

: Alias of`set_prior`

allowing to specify arguments as strings.

##### References

Gelman A. (2006). Prior distributions for variance parameters in hierarchical models. Bayesian analysis, 1(3), 515 -- 534.

##### See Also

##### Examples

```
# NOT RUN {
## use alias functions
(prior1 <- prior(cauchy(0, 1), class = sd))
(prior2 <- prior_(~cauchy(0, 1), class = ~sd))
(prior3 <- prior_string("cauchy(0, 1)", class = "sd"))
identical(prior1, prior2)
identical(prior1, prior3)
## check which parameters can have priors
get_prior(rating ~ treat + period + carry + (1|subject),
data = inhaler, family = cumulative())
## define some priors
prior <- c(prior_string("normal(0,10)", class = "b"),
prior(normal(1,2), class = b, coef = treat),
prior_(~cauchy(0,2), class = ~sd,
group = ~subject, coef = ~Intercept))
## verify that the priors indeed found their way into Stan's model code
make_stancode(rating ~ treat + period + carry + (1|subject),
data = inhaler, family = cumulative(),
prior = prior)
## use the horseshoe prior to model sparsity in population-level effects
make_stancode(count ~ zAge + zBase * Trt,
data = epilepsy, family = poisson(),
prior = set_prior("horseshoe(3)"))
## alternatively use the lasso prior
make_stancode(count ~ zAge + zBase * Trt,
data = epilepsy, family = poisson(),
prior = set_prior("lasso(1)"))
## pass priors to Stan without checking
prior <- prior_string("target += normal_lpdf(b[1] | 0, 1)", check = FALSE)
make_stancode(count ~ Trt, data = epilepsy, prior = prior)
# }
```

*Documentation reproduced from package brms, version 2.9.0, License: GPL (>= 3)*