Define priors for specific parameters or classes of parameters
set_prior(prior, class = "b", coef = "", group = "", resp = "",
  dpar = "", nlpar = "", lb = NULL, ub = NULL, check = TRUE)prior(prior, ...)
prior_(prior, ...)
prior_string(prior, ...)
A character string defining a distribution in Stan language
The parameter class. Defaults to "b" 
(i.e. population-level effects). 
See 'Details' for other valid parameter classes.
Name of the (population- or group-level) parameter.
Grouping factor of group-level parameters.
Name of the response variable / category. Only used in multivariate and categorical models.
Name of a distributional parameter. Only used in distributional models.
Name of a non-linear parameter. Only used in non-linear models.
Lower bound for parameter restriction. Currently only allowed
for classes "b", "ar", "ma", and "arr".
Defaults to NULL, that is no restriction.
Upper bound for parameter restriction. Currently only allowed
for classes "b", "ar", "ma", and "arr".
Defaults to NULL, that is no restriction.
Logical; Indicates whether priors
should be checked for validity (as far as possible).
Defaults to TRUE. If FALSE, prior is passed
to the Stan code as is, and all other arguments are ignored.
Arguments passed to set_prior.
An object of class brmsprior to be used in the prior
  argument of brm.
prior: Alias of set_prior allowing to 
specify arguments as expressions without quotation marks.
prior_: Alias of set_prior allowing to specify 
arguments as as one-sided formulas or wrapped in quote.
prior_string: Alias of set_prior allowing to
specify arguments as strings.
set_prior is used to define prior distributions for parameters 
  in brms models. The functions prior, prior_, and
  prior_string are aliases of set_prior each allowing
  for a different kind of argument specification. 
  prior allows specifying arguments as expression without
  quotation marks using non-standard evaluation. 
  prior_ allows specifying arguments as one-sided formulas
  or wrapped in quote.
  prior_string allows specifying arguments as strings just
  as set_prior itself.
Below, we explain its usage and list some common prior distributions for parameters. A complete overview on possible prior distributions is given in the Stan Reference Manual available at http://mc-stan.org/.
To combine multiple priors, use c(...) or the + operator 
  (see 'Examples'). brms does not check if the priors are written 
  in correct Stan language. Instead, Stan will check their 
  syntactical correctness when the model is parsed to C++ and 
  returns an error if they are not. 
  This, however, does not imply that priors are always meaningful if they are 
  accepted by Stan. Although brms trys to find common problems 
  (e.g., setting bounded priors on unbounded parameters), there is no guarantee 
  that the defined priors are reasonable for the model.
  Below, we list the types of parameters in brms models, 
  for which the user can specify prior distributions.
1. Population-level ('fixed') effects
Every Population-level effect has its own regression parameter 
  represents the name of the corresponding population-level effect. 
  Suppose, for instance, that y is predicted by x1 and x2 
  (i.e., y ~ x1 + x2 in formula syntax). 
  Then, x1 and x2 have regression parameters 
  b_x1 and b_x2 respectively. 
  The default prior for population-level effects (including monotonic and 
  category specific effects) is an improper flat prior over the reals. 
  Other common options are normal priors or student-t priors. 
  If we want to have a normal prior with mean 0 and 
  standard deviation 5 for x1, and a unit student-t prior with 10 
  degrees of freedom for x2, we can specify this via
  set_prior("normal(0,5)", class = "b", coef = "x1") and 
  set_prior("student_t(10,0,1)", class = "b", coef = "x2").
  To put the same prior on all population-level effects at once, 
  we may write as a shortcut set_prior("<prior>", class = "b"). 
  This also leads to faster sampling, because priors can be vectorized in this case. 
  Both ways of defining priors can be combined using for instance 
  set_prior("normal(0,2)", class = "b") and 
  set_prior("normal(0,10)", class = "b", coef = "x1")
  at the same time. This will set a normal(0,10) prior on 
  the effect of x1 and a normal(0,2) prior 
  on all other population-level effects. 
  However, this will break vectorization and
  may slow down the sampling procedure a bit.
In case of the default intercept parameterization 
  (discussed in the 'Details' section of 
  brmsformula),
  general priors on class "b" will not affect 
  the intercept. Instead, the intercept has its own parameter class 
  named "Intercept" and priors can thus be 
  specified via set_prior("<prior>", class = "Intercept").
  Setting a prior on the intercept will not break vectorization
  of the other population-level effects.
  Note that technically, this prior is set on an intercept that
  results when internally centering all population-level predictors 
  around zero to improve sampling efficiency. On this centered 
  intercept, specifying a prior is actually much easier and 
  intuitive than on the original intercept, since the former 
  represents the expected response value when all predictors 
  are at their means. To treat the intercept as an ordinary 
  population-level effect and avoid the centering parameterization, 
  use 0 + intercept on the right-hand side of the model formula.
A special shrinkage prior to be applied on population-level effects 
  is the horseshoe prior. See horseshoe
  for details. Another shrinkage prior is the so-called lasso prior.
  See lasso for details.
In non-linear models, population-level effects are defined separately 
  for each non-linear parameter. Accordingly, it is necessary to specify
  the non-linear parameter in set_prior so that priors
  we can be assigned correctly. 
  If, for instance, alpha is the parameter and x the predictor
  for which we want to define the prior, we can write
  set_prior("<prior>", coef = "x", nlpar = "alpha"). 
  As a shortcut we can use set_prior("<prior>", nlpar = "alpha")
  to set the same prior on all population-level effects of alpha at once.
If desired, population-level effects can be restricted to fall only 
  within a certain interval using the lb and ub arguments
  of set_prior. This is often required when defining priors
  that are not defined everywhere on the real line, such as uniform
  or gamma priors. When defining a uniform(2,4) prior, 
  you should write set_prior("uniform(2,4)", lb = 2, ub = 4). 
  When using a prior that is defined on the positive reals only 
  (such as a gamma prior) set lb = 0. 
  In most situations, it is not useful to restrict population-level
  parameters through bounded priors 
  (non-linear models are an important exception), 
  but if you really want to this is the way to go.
2. Standard deviations of group-level ('random') effects
Each group-level effect of each grouping factor has a standard deviation named
  sd_<group>_<coef>. Consider, for instance, the formula 
  y ~ x1 + x2 + (1 + x1 | g).
  We see that the intercept as well as x1 are group-level effects
  nested in the grouping factor g. 
  The corresponding standard deviation parameters are named as 
  sd_g_Intercept and sd_g_x1 respectively. 
  These parameters are restricted to be non-negative and, by default, 
  have a half student-t prior with 3 degrees of freedom and a 
  scale parameter that depends on the standard deviation of the response 
  after applying the link function. Minimally, the scale parameter is 10. 
  This prior is used (a) to be only very weakly informative in order to influence
  results as few as possible, while (b) providing at least some regularization
  to considerably improve convergence and sampling efficiency.
  To define a prior distribution only for standard deviations 
  of a specific grouping factor,
  use  set_prior("<prior>", class = "sd", group = "<group>"). 
  To define a prior distribution only for a specific standard deviation 
  of a specific grouping factor, you may write 
  set_prior("<prior>", class = "sd", group = "<group>", coef = "<coef>"). 
  Recommendations on useful prior distributions for 
  standard deviations are given in Gelman (2006), but note that he
  is no longer recommending uniform priors, anymore.
When defining priors on group-level parameters in non-linear models, 
  please make sure to specify the corresponding non-linear parameter 
  through the nlpar argument in the same way as 
  for population-level effects.
3. Correlations of group-level ('random') effects
If there is more than one group-level effect per grouping factor, 
  the correlations between those effects have to be estimated. 
  The prior "lkj_corr_cholesky(eta)" or in short 
  "lkj(eta)" with eta > 0 
  is essentially the only prior for (Cholesky factors) of correlation matrices. 
  If eta = 1 (the default) all correlations matrices 
  are equally likely a priori. If eta > 1, extreme correlations 
  become less likely, whereas 0 < eta < 1 results in 
  higher probabilities for extreme correlations. 
  Correlation matrix parameters in brms models are named as 
  cor_<group>, (e.g., cor_g if g is the grouping factor).
  To set the same prior on every correlation matrix, 
  use for instance set_prior("lkj(2)", class = "cor").
  Internally, the priors are transformed to be put on the Cholesky factors
  of the correlation matrices to improve efficiency and numerical stability.
  The corresponding parameter class of the Cholesky factors is L,
  but it is not recommended to specify priors for this parameter class directly.
4. Smooth terms (splines)
GAMMs are implemented in brms using the 'random effects' 
  formulation of smooth terms (for details see 
  gamm). Thus, each smooth term
  has its corresponding standard deviation modeling
  the variability within this term. In brms, this 
  parameter class is called sds and priors can
  be specified via set_prior("<prior>", class = "sds", 
  coef = "<term label>"). The default prior is the same as
  for standard deviations of group-level effects.
5. Gaussian processes
Gaussian processes as currently implemented in brms have
  two parameters, the standard deviation parameter sdgp, 
  and characteristic length-scale parameter lscale 
  (see gp for more details). The default prior 
  of sdgp is the same as for standard deviations of 
  group-level effects. The default prior of lscale
  is an informative inverse-gamma prior specifically tuned 
  to the covariates of the Gaussian process (for more details see
  https://betanalpha.github.io/assets/case_studies/gp_part3/part3.html).
  This tuned prior may be overly informative in some cases, so please 
  consider other priors as well to make sure inference is
  robust to the prior specification. If tuning fails, a half-normal prior 
  is used instead.
6. Autocorrelation parameters
The autocorrelation parameters currently implemented are named 
  ar (autoregression), ma (moving average),
  and arr (autoregression of the response).
Priors can be defined by set_prior("<prior>", class = "ar") 
  for ar and similar for ma and arr effects.
  By default, ar and ma are bounded between -1 
  and 1 and arr is unbounded (you may change this 
  by using the arguments lb and ub). The default
  prior is flat over the definition area.
7. Distance parameters of monotonic effects
As explained in the details section of brm,
  monotonic effects make use of a special parameter vector to
  estimate the 'normalized distances' between consecutive predictor 
  categories. This is realized in Stan using the simplex
  parameter type. This class is named "simo" (short for 
  simplex monotonic) in brms. 
  The only valid prior for simplex parameters is the
  dirichlet prior, which accepts a vector of length K - 1
  (K = number of predictor categories) as input defining the
  'concentration' of the distribution. Explaining the dirichlet prior 
  is beyond the scope of this documentation, but we want to describe
  how to define this prior syntactically correct.
  If a predictor x with K categories is modeled as monotonic, 
  we can define a prior on its corresponding simplex via 
  prior(dirichlet(<vector>), class = simo, coef = mox1).
  The 1 in the end of coef indicates that this is the first
  simplex in this term. If interactions between multiple monotonic
  variables are modeled, multiple simplexes per term are required.  
  For <vector>, we can put in any R expression
  defining a vector of length K - 1. The default is a uniform 
  prior (i.e. <vector> = rep(1, K-1)) over all simplexes
  of the respective dimension.
8. Parameters for specific families
Some families need additional parameters to be estimated. 
  Families gaussian, student, skew_normal,
  lognormal, and gen_extreme_value need the parameter 
  sigma to account for the residual standard deviation.
  By default, sigma has a half student-t prior that scales 
  in the same way as the group-level standard deviations.
  Further, family student needs the parameter 
  nu representing the degrees of freedom of students-t distribution. 
  By default, nu has prior "gamma(2, 0.1)"
  and a fixed lower bound of 1.
  Families gamma, weibull, inverse.gaussian, and
  negbinomial need a shape parameter that has a 
  "gamma(0.01, 0.01)" prior by default. 
  For families cumulative, cratio, sratio, 
  and acat, and only if threshold = "equidistant", 
  the parameter delta is used to model the distance between 
  two adjacent thresholds. 
  By default, delta has an improper flat prior over the reals.
  The von_mises family needs the parameter kappa, representing
  the concentration parameter. By default, kappa has prior 
  "gamma(2, 0.01)". 
  Every family specific parameter has its own prior class, so that
  set_prior("<prior>", class = "<parameter>") is the right way to go.
  All of these priors are chosen to be weakly informative,
  having only minimal influence on the estimations,
  while improving convergence and sampling efficiency.
Often, it may not be immediately clear, 
  which parameters are present in the model.
  To get a full list of parameters and parameter classes for which 
  priors can be specified (depending on the model) 
  use function get_prior.
Gelman A. (2006). Prior distributions for variance parameters in hierarchical models. Bayesian analysis, 1(3), 515 -- 534.
# NOT RUN {
## use alias functions
(prior1 <- prior(cauchy(0, 1), class = sd))
(prior2 <- prior_(~cauchy(0, 1), class = ~sd))
(prior3 <- prior_string("cauchy(0, 1)", class = "sd"))
identical(prior1, prior2)
identical(prior1, prior3)
## check which parameters can have priors
get_prior(rating ~ treat + period + carry + (1|subject),
          data = inhaler, family = cumulative())
         
## define some priors          
prior <- c(prior_string("normal(0,10)", class = "b"),
           prior(normal(1,2), class = b, coef = treat),
           prior_(~cauchy(0,2), class = ~sd, 
                  group = ~subject, coef = ~Intercept))
              
## verify that the priors indeed found their way into Stan's model code
make_stancode(rating ~ treat + period + carry + (1|subject),
              data = inhaler, family = cumulative(),
              prior = prior)
              
## use the horseshoe prior to model sparsity in population-level effects
make_stancode(count ~ log_Age_c + log_Base4_c * Trt_c,
              data = epilepsy, family = poisson(),
              prior = set_prior("horseshoe(3)"))
              
## alternatively use the lasso prior
make_stancode(count ~ log_Age_c + log_Base4_c * Trt_c,
              data = epilepsy, family = poisson(),
              prior = set_prior("lasso(1)"))
## pass priors to Stan without checking
prior <- prior_string("target += normal_lpdf(b[1] | 0, 1)", check = FALSE)
make_stancode(count ~ Trt_c, data = epilepsy, prior = prior)
# }
Run the code above in your browser using DataLab