Usage

brmsformula(formula, ..., flist = NULL, family = NULL, nl = NULL, nonlinear = NULL)

Arguments

formula
An object of class formula
(or one that can be coerced to that class): 
a symbolic description of the model to be fitted. 
The details of model specification are given in 'Details'.

...
Additional formula objects to specify 
predictors of non-linear and auxiliary parameters. 
Formulas can either be named directly or contain
names on their left-hand side. 
The following are auxiliary parameters of specific families
(all other parameters are treated as non-linear parameters):
sigma (residual standard deviation or scale of
the gaussian, student, lognormal, 
exgaussian, and asym_laplace families);
shape (shape parameter of the Gamma,
weibull, negbinomial, and related
zero-inflated / hurdle families); nu
(degrees of freedom parameter of the student family);
phi (precision parameter of the beta 
and zero_inflated_beta families);
kappa (precision parameter of the von_mises family);
beta (mean parameter of the exponential component
of the exgaussian family);
quantile (quantile parameter of the asym_laplace family);
zi (zero-inflation probability); 
hu (hurdle probability);
disc (discrimination) for ordinal models;
bs, ndt, and bias (boundary separation,
non-decision time, and initial bias of the wiener
diffusion model).
All auxiliary parameters are modeled 
on the log or logit scale to ensure correct definition
intervals after transformation.
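See 'Details' for more explanation.

flist
Optional list of formulas, which are treated in the same way as
formulas passed via the ... argument.

family
A description of the response distribution and link function to be
used in the model. Every family function has a
link argument that specifies
the link function to be applied on the response variable.
If not specified, default links are used.
For details of supported families see 
brmsfamily.
By default, a linear gaussian model is applied.

nl
Logical; indicates whether formula should be
treated as specifying a non-linear model. By default, formula 
is treated as an ordinary linear model formula.

nonlinear
An optional list of formulas specifying linear models for non-linear parameters.
If NULL (the default), formula is treated as an ordinary formula. 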
If not NULL, formula is treated as a non-linear model
and nonlinear should contain a formula for each non-linear 
parameter, which has the parameter on the left hand side and its
linear predictor on the right hand side.
Alternatively, it can be a single formula with all non-linear
parameters on the left hand side (separated by a +) and a
common linear predictor on the right hand side.
As of brms 1.4.0, we recommend specifying non-linear
parameters directly within formula.

Value

An object of class brmsformula, which
is essentially a list containing all model
formulas as well as some additional information.

Details

The formula argument accepts formulae of the following syntax:
  
  response | aterms ~ pterms + (gterms | group) 
  
  The pterms part contains effects that are assumed to be the 
  same across observations. We call them 'population-level' effects
  or (adopting frequentist vocabulary) 'fixed' effects. The optional
  gterms part may contain effects that are assumed to vary
  across grouping variables specified in group. We
  call them 'group-level' effects or (adopting frequentist 
  vocabulary) 'random' effects, although the latter name is misleading
  in a Bayesian context.
  For more details type vignette("brms_overview").
  
  Group-level terms
  
  Multiple grouping factors each with multiple group-level effects 
  are possible. 
  Instead of | you may use || in grouping terms
  to prevent correlations from being modeled. 
  Alternatively, it is possible to model different group-level terms of 
  the same grouping factor as correlated (even across different formulae,
  e.g., in non-linear models) by using |<ID>| instead of |.
  All group-level terms sharing the same ID will be modeled as correlated.
  If, for instance, one specifies the terms (1+x|2|g) and 
  (1+z|2|g) somewhere in the formulae passed to brmsformula,
  correlations between the corresponding group-level effects 
  will be estimated. 
  
  You can specify multi-membership terms
  using the mm function. For instance, 
  a multi-membership term with two members could be
  (1|mm(g1, g2)), where g1 and g2 specify
  the first and second member, respectively.
  
  Special predictor terms
  
  Smoothing terms can be modeled using the s
  and t2 functions of the mgcv package 
  in the pterms part of the model formula.
  This makes it possible to fit generalized additive mixed models (GAMMs) with brms. 
  The implementation is similar to that used in the gamm4 package.
  For more details on this model class see gam 
  and gamm.
  
  The pterms and gterms parts may contain three non-standard
  effect types, namely monotonic, measurement error, and category specific effects,
  which can be specified using terms of the form mo(<predictors>),
  me(predictor, sd_predictor), and cs(<predictors>), 
  respectively. Category specific effects can only be estimated in
  ordinal models and are explained in more detail in the package's 
  main vignette (type vignette("brms_overview")). 
  The other two effect types are explained in the following.
  
  A monotonic predictor must either be integer valued or an ordered factor, 
  which is the first difference to an ordinary continuous predictor. 
  More importantly, predictor categories (or integers) are not assumed to be 
  equidistant with respect to their effect on the response variable. 
  Instead, the distance between adjacent predictor categories (or integers) 
  is estimated from the data and may vary across categories. 
  This is realized by parameterizing as follows: 
  One parameter takes care of the direction and size of the effect similar 
  to an ordinary regression parameter, while an additional parameter vector 
  estimates the normalized distances between consecutive predictor categories.     
  A main application of monotonic effects is ordinal predictors that
  can this way be modeled without (falsely) treating them as continuous
  or as unordered categorical predictors. For more details and examples
  see vignette("brms_monotonic").
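
  The parameterization described above can be illustrated with a minimal
  base R sketch. The values of b and zeta are made up for illustration, and
  the helper name mo_effect is hypothetical; the exact scaling that brms
  applies internally may differ from this simplified form.

```r
# Hypothetical parameter values, chosen only for illustration.
b <- 2.0                   # direction and size of the overall effect
zeta <- c(0.5, 0.3, 0.2)   # normalized distances between adjacent categories (sum to 1)

# Effect of a monotonic predictor with categories coded 0, 1, ..., length(zeta):
# b times the sum of the normalized distances up to category x.
mo_effect <- function(x, b, zeta) {
  b * sum(zeta[seq_len(x)])
}

effects <- vapply(0:3, mo_effect, numeric(1), b = b, zeta = zeta)
print(effects)
```

  The steps between consecutive categories are unequal, yet the effect never
  changes direction, which is what distinguishes a monotonic predictor from
  both a continuous and an unordered categorical one.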
  
  Quite often, predictors are measured and as such naturally contain 
  measurement error. Although most researchers are well aware of this problem,
  measurement error in predictors is ignored in most
  regression analyses, possibly because only a few packages allow
  for modelling it. Notably, measurement error can be handled in 
  structural equation models, but many more general regression models
  (such as those featured by brms) cannot be transferred 
  to the SEM framework. In brms, effects of noise-free predictors 
  can be modeled using the me (for 'measurement error') function.
  If, say, y is the response variable and 
  x is a measured predictor with known measurement error
  sdx, we can simply include it on the right-hand side of the
  model formula via y ~ me(x, sdx). 
  This can easily be extended to more general formulae. 
  If x2 is another measured predictor with corresponding error
  sdx2 and z is a predictor without error
  (e.g., an experimental setting), we can model all main effects 
  and interactions of the three predictors in the well known manner: 
  y ~ me(x, sdx) * me(x2, sdx2) * z. In a future version of brms,
  a vignette will be added to explain more details about these
  so-called 'errors-in-variables' models and provide real-world examples.
  
  Additional response information
  
  Another speciality of the brms formula syntax is the optional 
  aterms part, which may contain 
  multiple terms of the form fun(<variable>) separated by +, each 
  providing special information on the response variable. fun can be 
  replaced with either se, weights, disp, trials,
  cat, cens, trunc, or dec.
  Their meanings are explained below 
  (see also addition-terms). 
  
  For families gaussian and student, it is 
  possible to specify standard errors of the observations, which makes it 
  possible to perform meta-analysis. Suppose that the variable yi contains 
  the effect sizes from the studies and sei the corresponding 
  standard errors. Then, fixed and random effects meta-analyses can 
  be conducted using the formulae yi | se(sei) ~ 1 and 
  yi | se(sei) ~ 1 + (1|study), respectively, where 
  study is a variable uniquely identifying every study.
  If desired, meta-regression can be performed via 
  yi | se(sei) ~ 1 + mod1 + mod2 + (1|study) 
  or  yi | se(sei) ~ 1 + mod1 + mod2 + (1 + mod1 + mod2|study), 
  where mod1 and mod2 represent moderator variables. 
  By default, the standard errors replace the parameter sigma.
  To model sigma in addition to the known standard errors,
  set argument sigma in function se to TRUE, 
  for instance, yi | se(sei, sigma = TRUE) ~ 1.
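
  The meta-analytic formulae above can be collected as formula objects, as in
  the following sketch. It assumes the brms package is installed; the variable
  names (yi, sei, study, mod1, mod2) are the placeholders used in the text,
  and no data are needed to construct the objects.

```r
library(brms)

# Fixed and random effects meta-analysis
fixed_ma  <- bf(yi | se(sei) ~ 1)
random_ma <- bf(yi | se(sei) ~ 1 + (1 | study))

# Meta-regression with two moderators
meta_reg <- bf(yi | se(sei) ~ 1 + mod1 + mod2 + (1 | study))

# Estimate sigma in addition to the known standard errors
with_sigma <- bf(yi | se(sei, sigma = TRUE) ~ 1 + (1 | study))
```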
  
  For all families, weighted regression may be performed using
  weights in the aterms part. Internally, this is 
  implemented by multiplying the log-posterior values of each 
  observation by their corresponding weights.
  Suppose that variable wei contains the weights 
  and that yi is the response variable. 
  Then, formula yi | weights(wei) ~ predictors 
  implements a weighted regression. 
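
  As a rough illustration of what multiplying per-observation log-density
  values by weights means, the base R sketch below compares a weighted
  gaussian log-likelihood with the log-likelihood of a data set in which
  observations are repeated according to their (integer) weights. All values
  are made up, and the sketch does not reproduce the code brms generates.

```r
# Hypothetical data and parameter values, for illustration only.
y     <- c(1.2, 0.7, 2.1)
wei   <- c(1, 2, 1)      # integer weights so that duplication is possible
mu    <- 1
sigma <- 0.8

# Each observation's log-density is multiplied by its weight.
weighted_ll <- sum(wei * dnorm(y, mean = mu, sd = sigma, log = TRUE))

# Equivalent: repeat each observation as often as its weight says.
duplicated_ll <- sum(dnorm(rep(y, times = wei), mean = mu, sd = sigma, log = TRUE))

print(c(weighted = weighted_ll, duplicated = duplicated_ll))
```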
  
  The addition argument disp (short for dispersion) serves a
  similar purpose to weights. However, it has a different 
  implementation and is less general, as it is only usable for the
  families gaussian, student, lognormal,
  exgaussian, asym_laplace, Gamma, 
  weibull, and negbinomial.
  For the first five families, the residual standard deviation 
  sigma is multiplied by the values given in 
  disp, so that higher values lead to lower weights.
  In contrast, for the last three families, the parameter shape
  is multiplied by the values given in disp. As shape
  can be understood as a precision parameter (inverse of the variance),
  higher values will lead to higher weights in this case.
  
  For families binomial and zero_inflated_binomial, 
  the aterms part should contain a trials term giving the number of trials 
  underlying each observation. In lme4 syntax, we may write for instance 
  cbind(success, n - success), which is equivalent
  to success | trials(n) in brms syntax. If the number of trials
  is constant across all observations, say 10, 
  we may also write success | trials(10). 
  
  For all ordinal families, aterms may contain a term 
  cat(number) to specify the number of categories (e.g., cat(7)). 
  If not given, the number of categories is calculated from the data.
  
  With the exception of categorical and ordinal families, 
  left, right, and interval censoring can be modeled through 
  y | cens(censored) ~ predictors. The censoring variable 
  (named censored in this example) should contain the values 
  'left', 'none', 'right', and 'interval' 
  (or equivalently -1, 0, 1, and 2) to indicate that 
  the corresponding observation is left censored, not censored, right censored,
  or interval censored. For interval censored data, a second variable
  (let's call it y2) has to be passed to cens. In this case, 
  the formula has the structure y | cens(censored, y2) ~ predictors. 
  While the lower bounds are given in y, 
  the upper bounds are given in y2 for interval
  censored data. Intervals are assumed to be open on the left and closed 
  on the right: (y, y2].
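
  The four censoring types correspond to different likelihood contributions.
  The base R sketch below shows the standard form of these contributions for
  a gaussian response; the function name cens_loglik and all values are
  hypothetical, and this is not necessarily how brms implements censoring.

```r
# Log-likelihood contribution of one observation under a gaussian model.
# 'censored' takes the values described above; y2 is only used for intervals.
cens_loglik <- function(y, censored, y2 = NA, mu = 0, sigma = 1) {
  switch(censored,
    none     = dnorm(y, mu, sigma, log = TRUE),                       # exact value observed
    left     = pnorm(y, mu, sigma, log.p = TRUE),                     # true value <= y
    right    = pnorm(y, mu, sigma, lower.tail = FALSE, log.p = TRUE), # true value > y
    interval = log(pnorm(y2, mu, sigma) - pnorm(y, mu, sigma)),       # true value in (y, y2]
    stop("unknown censoring type: ", censored)
  )
}

ll <- c(
  none     = cens_loglik(0.3, "none"),
  left     = cens_loglik(0.3, "left"),
  right    = cens_loglik(0.3, "right"),
  interval = cens_loglik(-1, "interval", y2 = 1)
)
print(ll)
```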
  
  With the exception of categorical and ordinal families, the response 
  distribution can be truncated using the trunc function in the addition part.
  If the response variable is truncated between, say, 0 and 100, we can specify this via
  yi | trunc(lb = 0, ub = 100) ~ predictors. 
  Instead of numbers, variables in the data set can also be passed allowing 
  for varying truncation points across observations. 
  Defining only one of the two arguments in trunc 
  leads to one-sided truncation.
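
  Truncation rescales the response density so that it integrates to one over
  the allowed range. The base R sketch below shows this for a gaussian
  response; the function name dtrunc_norm and the parameter values are
  hypothetical and serve only to illustrate the idea.

```r
# Density of a normal distribution truncated to [lb, ub].
# Leaving lb or ub at its default gives one-sided truncation.
dtrunc_norm <- function(y, mu, sigma, lb = -Inf, ub = Inf) {
  inside <- y >= lb & y <= ub
  norm_const <- pnorm(ub, mu, sigma) - pnorm(lb, mu, sigma)
  ifelse(inside, dnorm(y, mu, sigma) / norm_const, 0)
}

# The truncated density integrates to 1 over the truncation range.
total <- integrate(dtrunc_norm, lower = 0, upper = 100,
                   mu = 50, sigma = 30, lb = 0, ub = 100)$value
print(total)
```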
  
  In Wiener diffusion models (family wiener) the addition term
  dec is mandatory to specify the (vector of) binary decisions 
  corresponding to the reaction times. Non-zero values will be treated
  as a response on the upper boundary of the diffusion process and zeros
  will be treated as a response on the lower boundary. Alternatively,
  the variable passed to dec might also be a character vector 
  consisting of 'lower' and 'upper'.
  
  Multiple addition terms may be specified at the same time using 
  the + operator, for instance 
  formula = yi | se(sei) + cens(censored) ~ 1 
  for a censored meta-analytic model. 
  
  Formula syntax for multivariate and categorical models
  
  For families gaussian and student,
  multivariate models may be specified using cbind notation. 
  In brms 1.0.0, the multivariate 'trait' syntax was removed 
  from the package as it repeatedly confused users, required much 
  special case coding, and was hard to maintain. Below the new 
  syntax is described. 
  Suppose that y1 and y2 are response variables 
  and x is a predictor. 
  Then cbind(y1,y2) ~ x specifies a multivariate model.
  The effects of all terms specified on the RHS of the formula 
  are assumed to vary across response variables (this was not the
  case by default in brms < 1.0.0). For instance, two parameters will
  be estimated for x, one for the effect
  on y1 and another for the effect on y2.
  This is also true for group-level effects. When writing, for instance,
  cbind(y1,y2) ~ x + (1+x|g), group-level effects will be
  estimated separately for each response. To model these effects
  as correlated across responses, use the ID syntax (see above).
  For the present example, this would look as follows:
  cbind(y1,y2) ~ x + (1+x|2|g). Of course, you could also use
  any value other than 2 as ID. It is not yet possible
  to model terms as only affecting certain responses (and not others),
  but this will be implemented in the future.
   
  Categorical models use the same syntax as multivariate
  models. As in most other implementations of categorical models,
  values of one category (the first in brms) are fixed 
  to identify the model. Thus, all terms on the RHS of 
  the formula correspond to K - 1 effects 
  (K = number of categories), one for each non-fixed category.
  Group-level effects may be specified as correlated across
  categories using the ID syntax.
  
  As of brms 1.0.0, zero-inflated and hurdle models are specified 
  in the same way as their non-inflated counterparts. 
  However, they have additional auxiliary parameters 
  (named zi and hu respectively)
  modeling the zero-inflation / hurdle probability depending on which 
  model you choose. These parameters can also be affected by predictors
  in the same way as the response variable itself. See the end of the
  Details section for information on how to accomplish that.
  
  Parameterization of the population-level intercept
  
  The population-level intercept (if incorporated) is estimated separately 
  and not as part of population-level parameter vector b. 
  As a result, priors on the intercept also have to be specified separately
  (see set_prior for more details).
  Furthermore, to increase sampling efficiency, the population-level 
  design matrix X is centered around its column means 
  X_means if the intercept is incorporated. 
  This leads to a temporary bias in the intercept equal to 
  <X_means, b>, where <,> is the scalar product. 
  The bias is corrected after fitting the model, but be aware 
  that you are effectively defining a prior on the temporary
  intercept of the centered design matrix not on the real intercept.
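
  The relationship between the temporary and the real intercept can be
  checked with ordinary least squares, used here only as a stand-in for the
  Bayesian fit. The simulated data and all object names in this base R sketch
  are hypothetical.

```r
set.seed(1)
n <- 50
X <- cbind(x1 = rnorm(n, mean = 3), x2 = rnorm(n, mean = -1))
y <- 1 + 2 * X[, "x1"] - 0.5 * X[, "x2"] + rnorm(n)

# Center the design matrix around its column means.
X_means <- colMeans(X)
Xc <- sweep(X, 2, X_means)

# Fit on the centered design matrix: the intercept is the temporary one.
fit_centered   <- lm(y ~ Xc)
temp_intercept <- coef(fit_centered)[1]
b              <- coef(fit_centered)[-1]

# Correct the bias: subtract the scalar product of X_means and b.
real_intercept <- temp_intercept - sum(X_means * b)

# Fit on the uncentered design matrix for comparison.
fit_raw <- lm(y ~ X)
print(c(corrected = unname(real_intercept), raw = unname(coef(fit_raw)[1])))
```

  The slopes are unaffected by centering; only the intercept shifts, which is
  why a prior placed on the intercept in the default parameterization refers
  to the response at the mean of the predictors rather than at zero.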
  
  This behavior can be avoided by using the reserved 
  (and internally generated) variable intercept. 
  Instead of y ~ x, you may write
  y ~ 0 + intercept + x. This way, priors can be
  defined on the real intercept, directly. In addition,
  the intercept is just treated as an ordinary population-level effect
  and thus priors defined on b will also apply to it. 
  Note that this parameterization may be less efficient
  than the default parameterization discussed above.  
  
  Formula syntax for non-linear models
  
  In brms, it is possible to specify non-linear models 
  of arbitrary complexity.
  The non-linear model can just be specified within the formula
  argument. Suppose that we want to predict the response y
  through the predictor x, where x is linked to y
  through y = alpha - beta * lambda^x, with parameters
  alpha, beta, and lambda. This is certainly a
  non-linear model being defined via
  formula = y ~ alpha - beta * lambda^x (addition arguments 
  can be added in the same way as for ordinary formulas).
  To tell brms that this is a non-linear model, 
  we set argument nl to TRUE.
  Now we have to specify a model for each of the non-linear parameters. 
  Let's say we just want to estimate those three parameters
  with no further covariates or random effects. Then we can pass
  alpha + beta + lambda ~ 1 or equivalently
  (and more flexibly) alpha ~ 1, beta ~ 1, lambda ~ 1 
  to the ... argument.
  This can, of course, be extended. If we have another predictor z and 
  observations nested within the grouping factor g, we may write for 
  instance alpha ~ 1, beta ~ 1 + z + (1|g), lambda ~ 1.
  The formula syntax described above applies here as well.
  In this example, we are using z and g only for the 
  prediction of beta, but we might also use them for the other
  non-linear parameters (provided that the resulting model is still 
  scientifically reasonable). 
  
  Non-linear models may not be uniquely identified and / or show bad convergence.
  For this reason it is mandatory to specify priors on the non-linear parameters.
  For instructions on how to do that, see set_prior.
  
  Formula syntax for predicting auxiliary parameters
  
  It is also possible to predict auxiliary parameters of the response
  distribution such as the residual standard deviation sigma 
  in gaussian models or the hurdle probability hu in hurdle models. 
  The syntax closely resembles that of a non-linear 
  parameter, for instance sigma ~ x + s(z) + (1+x|g).
  
  Alternatively, one may fix auxiliary parameters to certain values.
  However, this is mainly useful when models become too 
  complicated and otherwise have convergence issues. 
  We thus suggest being generally careful when making use of this option. 
  The quantile parameter of the asym_laplace distribution
  is a good example where it is useful. By fixing quantile, 
  one can perform quantile regression for the specified quantile. 
  For instance, quantile = 0.25 allows predicting the 25%-quantile.
  Furthermore, the bias parameter in drift-diffusion models 
  is assumed to be 0.5 (i.e. no bias) in many applications. 
  To achieve this, simply write bias = 0.5. 
  Other possible applications are the Cauchy 
  distribution as a special case of the Student-t distribution with 
  nu = 1, or the geometric distribution as a special case of
  the negative binomial distribution with shape = 1.
  Furthermore, the parameter disc ('discrimination') in ordinal 
  models is fixed to 1 by default and not estimated,
  but may be modeled as any other auxiliary parameter if desired
  (see examples). For reasons of identification, 'disc'
  can only be positive, which is achieved by applying the log-link.
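
  The two special cases mentioned above can be verified numerically in base R.
  This sketch only compares the standard density functions shipped with R
  (with location 0 and scale 1 for the Student-t case); the evaluation points
  are arbitrary.

```r
# Student-t with nu = 1 is the Cauchy distribution.
x <- c(-2, 0, 1.5)
cauchy_ok <- isTRUE(all.equal(dt(x, df = 1), dcauchy(x)))

# Negative binomial with shape = 1 is the geometric distribution.
# In the mean parameterization, the success probability is 1 / (1 + mu).
k  <- 0:5
mu <- 3
geom_ok <- isTRUE(all.equal(dnbinom(k, size = 1, mu = mu),
                            dgeom(k, prob = 1 / (1 + mu))))

print(c(cauchy = cauchy_ok, geometric = geom_ok))
```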
  
  All auxiliary parameters currently supported by brmsformula
  have to be positive (a negative standard deviation or precision parameter 
  doesn't make any sense) or are bounded between 0 and 1 (for zero-inflated / 
  hurdle probabilities, quantiles, or the initial bias parameter of 
  drift-diffusion models). 
  However, linear predictors can be positive or negative, and thus
  the log link (for positive parameters) or logit link (for probability parameters) 
  are used to ensure that auxiliary parameters are within their valid intervals.
  This implies that effects for auxiliary parameters are estimated on the
  log / logit scale and one has to apply the inverse link function to get 
  to the effects on the original scale. 
  See also brmsfamily for an overview of 
  valid link functions.
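
  Applying the inverse link is a one-line computation in base R. In the
  sketch below, the coefficient values are invented for illustration; exp is
  the inverse of the log link and plogis is the inverse of the logit link.

```r
# Hypothetical estimates on the link scale, for illustration only.
sigma_intercept <- -0.5   # log scale
sigma_slope     <-  0.3   # log scale, effect of predictor x
zi_intercept    <- -1.0   # logit scale

# Residual standard deviation at x = 2 on the original scale.
x <- 2
sigma_at_x <- exp(sigma_intercept + sigma_slope * x)

# Zero-inflation probability on the original scale.
zi_prob <- plogis(zi_intercept)

# On the log scale, a slope acts multiplicatively after back-transformation.
sigma_ratio <- exp(sigma_slope)

print(c(sigma_at_x = sigma_at_x, zi_prob = zi_prob, sigma_ratio = sigma_ratio))
```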

Examples

# multilevel model with smoothing terms
brmsformula(y ~ x1*x2 + s(z) + (1+x1|1) + (1|g2))
# additionally predict 'sigma'
brmsformula(y ~ x1*x2 + s(z) + (1+x1|1) + (1|g2), 
            sigma ~ x1 + (1|g2))
            
# use the shorter alias 'bf'
(formula1 <- brmsformula(y ~ x + (x|g)))
(formula2 <- bf(y ~ x + (x|g)))
# will be TRUE
identical(formula1, formula2)
# incorporate censoring
bf(y | cens(censor_variable) ~ predictors)
# define a simple non-linear model
bf(y ~ a1 - a2^x, a1 + a2 ~ 1, nl = TRUE)
# predict a1 and a2 differently
bf(y ~ a1 - a2^x, a1 ~ 1, a2 ~ x + (x|g), nl = TRUE)
# correlated group-level effects across parameters
bf(y ~ a1 - a2^x, a1 ~ 1 + (1|2|g), a2 ~ x + (x|2|g), nl = TRUE)
# define a multivariate model
bf(cbind(y1, y2) ~ x * z + (1|g))
# define a zero-inflated model 
# also predicting the zero-inflation part
bf(y ~ x * z + (1+x|ID1|g), zi ~ x + (1|ID1|g))
# specify a predictor as monotonic
bf(y ~ mo(x) + more_predictors)
# for ordinal models only
# specify a predictor as category specific
bf(y ~ cs(x) + more_predictors)
# add a category specific group-level intercept
bf(y ~ cs(x) + (cs(1)|g))
# specify parameter 'disc'
bf(y ~ person + item, disc ~ item)
# specify variables containing measurement error
bf(y ~ me(x, sdx))
# specify predictors on all parameters of the wiener diffusion model
# the main formula models the drift rate 'delta'
bf(rt | dec(decision) ~ x, bs ~ x, ndt ~ x, bias ~ x)
# fix the bias parameter to 0.5
bf(rt | dec(decision) ~ x, bias = 0.5)