brms (version 1.10.2)

brmsformula: Set up a model formula for use in brms

Description

Set up a model formula for use in the brms package, allowing users to define (potentially non-linear) additive multilevel models for all parameters of the assumed response distribution.

Usage

```
brmsformula(formula, ..., flist = NULL, family = NULL, autocor = NULL,
            nl = NULL, nonlinear = NULL)
```

Arguments

formula

An object of class `formula` (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given in 'Details'.

...

Additional `formula` objects to specify predictors of non-linear and distributional parameters. Formulas can either be named directly or contain names on their left-hand side. The following are distributional parameters of specific families (all other parameters are treated as non-linear parameters): `sigma` (residual standard deviation or scale of the `gaussian`, `student`, `lognormal`, `exgaussian`, and `asym_laplace` families); `shape` (shape parameter of the `Gamma`, `weibull`, `negbinomial`, and related zero-inflated / hurdle families); `nu` (degrees of freedom parameter of the `student` family); `phi` (precision parameter of the `beta` and `zero_inflated_beta` families); `kappa` (precision parameter of the `von_mises` family); `beta` (mean parameter of the exponential component of the `exgaussian` family); `quantile` (quantile parameter of the `asym_laplace` family); `zi` (zero-inflation probability); `hu` (hurdle probability); `zoi` (zero-one-inflation probability); `coi` (conditional one-inflation probability); `disc` (discrimination) for ordinal models; `bs`, `ndt`, and `bias` (boundary separation, non-decision time, and initial bias of the `wiener` diffusion model). All distributional parameters are modeled on the log or logit scale so that they stay within their valid intervals after back-transformation. See 'Details' for more explanation.

flist

Optional list of formulas, which are treated in the same way as formulas passed via the `...` argument.

family

Same argument as in `brm`. If `family` is specified in `brmsformula`, it will overwrite the value specified in `brm`.

autocor

An optional `cor_brms` object describing the correlation structure within the response variable (i.e., the 'autocorrelation'). See the documentation of `cor_brms` for a description of the available correlation structures. Defaults to `NULL`, corresponding to no correlations.

nl

Logical; indicates whether `formula` should be treated as specifying a non-linear model. By default, `formula` is treated as an ordinary linear model formula.

nonlinear

(Deprecated) An optional list of formulas, specifying linear models for non-linear parameters. If `NULL` (the default) `formula` is treated as an ordinary formula. If not `NULL`, `formula` is treated as a non-linear model and `nonlinear` should contain a formula for each non-linear parameter, which has the parameter on the left hand side and its linear predictor on the right hand side. Alternatively, it can be a single formula with all non-linear parameters on the left hand side (separated by a `+`) and a common linear predictor on the right hand side. As of brms 1.4.0, we recommend specifying non-linear parameters directly within `formula`.

Value

An object of class `brmsformula`, which is essentially a `list` containing all model formulas as well as some additional information.

Details

General formula structure

The `formula` argument accepts formulae of the following syntax:

`response | aterms ~ pterms + (gterms | group)`

The `pterms` part contains effects that are assumed to be the same across observations. We call them 'population-level' effects or (adopting frequentist vocabulary) 'fixed' effects. The optional `gterms` part may contain effects that are assumed to vary across grouping variables specified in `group`. We call them 'group-level' effects or (adopting frequentist vocabulary) 'random' effects, although the latter name is misleading in a Bayesian context. For more details type `vignette("brms_overview")` and `vignette("brms_multilevel")`.

Group-level terms

Multiple grouping factors, each with multiple group-level effects, are possible (of course, we can also run models without any group-level effects). Instead of `|` you may use `||` in grouping terms to prevent correlations from being modeled. Alternatively, it is possible to model different group-level terms of the same grouping factor as correlated (even across different formulae, e.g., in non-linear models) by using `|<ID>|` instead of `|`. All group-level terms sharing the same ID will be modeled as correlated. If, for instance, one specifies the terms `(1+x|2|g)` and `(1+z|2|g)` somewhere in the formulae passed to `brmsformula`, correlations between the corresponding group-level effects will be estimated.

You can specify multi-membership terms using the `mm` function. For instance, a multi-membership term with two members could be `(1|mm(g1, g2))`, where `g1` and `g2` specify the first and second member, respectively.
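The grouping syntax described above can be sketched as follows. This is only an illustration: the variable names (`y`, `x`, `z`, `g`, `g1`, `g2`) are placeholders rather than columns of a real data set, and the calls only build formula objects without fitting anything.

```r
library(brms)

# Group-level intercept and slope for g, modeled as uncorrelated via '||'
f_uncor <- bf(y ~ x + (1 + x || g))

# Group-level terms in two different formulas that share the ID '2':
# correlations between all of these effects will be estimated
f_id <- bf(
  y ~ x + (1 + x | 2 | g),
  sigma ~ z + (1 + z | 2 | g)
)

# Multi-membership term with two members, g1 and g2
f_mm <- bf(y ~ 1 + (1 | mm(g1, g2)))
```

Any value could be used as the ID in place of `2`; what matters is that the terms to be correlated share it.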

Special predictor terms

Smoothing terms can be modeled using the `s` and `t2` functions in the `pterms` part of the model formula. This allows fitting generalized additive mixed models (GAMMs) with brms. The implementation is similar to that used in the gamm4 package. For more details on this model class see `gam` and `gamm`.

Gaussian process terms can be fitted using the `gp` function in the `pterms` part of the model formula. Similar to smooth terms, Gaussian processes can be used to model complex non-linear relationships, for instance temporal or spatial autocorrelation. However, they are computationally demanding and are thus not recommended for very large datasets.

The `pterms` and `gterms` parts may contain three non-standard effect types, namely monotonic, measurement error, and category specific effects, which can be specified using terms of the form `mo(predictor)`, `me(predictor, sd_predictor)`, and `cs(<predictors>)`, respectively. Category specific effects can only be estimated in ordinal models and are explained in more detail in the package's main vignette (type `vignette("brms_overview")`). The other two effect types are explained in the following.

A monotonic predictor must either be integer valued or an ordered factor, which is the first difference to an ordinary continuous predictor. More importantly, predictor categories (or integers) are not assumed to be equidistant with respect to their effect on the response variable. Instead, the distance between adjacent predictor categories (or integers) is estimated from the data and may vary across categories. This is realized by parameterizing as follows: One parameter takes care of the direction and size of the effect similar to an ordinary regression parameter, while an additional parameter vector estimates the normalized distances between consecutive predictor categories. A main application of monotonic effects is ordinal predictors, which can this way be modeled without (falsely) treating them as continuous or as unordered categorical predictors. For more details and examples see `vignette("brms_monotonic")`.
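To make the parameterization above concrete, here is a small numeric sketch in base R. It assumes that the contribution of category `k` is the effect parameter times the sum of the normalized distances up to `k`; the exact scaling used internally by brms may differ, and the values of `b` and `zeta` are made up for illustration.

```r
# Hypothetical parameter values, chosen only for illustration
b <- 1.5                    # direction and size of the effect
zeta <- c(0.6, 0.3, 0.1)    # normalized distances between 4 adjacent categories

# The normalized distances sum to one
stopifnot(isTRUE(all.equal(sum(zeta), 1)))

# Assumed form: the lowest category contributes 0, each further category
# adds b times the distance to its predecessor
mo_effect <- b * c(0, cumsum(zeta))
names(mo_effect) <- paste0("cat", 1:4)
mo_effect
```

Under this assumed form the steps between adjacent categories are unequal but all share the sign of `b`, which is what makes the effect monotonic.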

Quite often, predictors are measured and as such naturally contain measurement error. Although most researchers are well aware of this problem, measurement error in predictors is ignored in most regression analyses, possibly because only a few packages allow for modelling it. Notably, measurement error can be handled in structural equation models, but many more general regression models (such as those featured by brms) cannot be transferred to the SEM framework. In brms, effects of noise-free predictors can be modeled using the `me` (for 'measurement error') function. If, say, `y` is the response variable and `x` is a measured predictor with known measurement error `sdx`, we can simply include it on the right-hand side of the model formula via `y ~ me(x, sdx)`. This can easily be extended to more general formulae. If `x2` is another measured predictor with corresponding error `sdx2` and `z` is a predictor without error (e.g., an experimental setting), we can model all main effects and interactions of the three predictors in the well-known manner: `y ~ me(x, sdx) * me(x2, sdx2) * z`. In a future version of brms, a vignette will be added to explain more details about these so-called 'error-in-variables' models and provide real-world examples.

Another speciality of the brms formula syntax is the optional `aterms` part, which may contain multiple terms of the form `fun(<variable>)` separated by `+`, each providing special information on the response variable. `fun` can be replaced with any of `se`, `weights`, `disp`, `trials`, `cat`, `cens`, `trunc`, or `dec`. Their meanings are explained below (see also `addition-terms`).

For families `gaussian` and `student`, it is possible to specify standard errors of the observations, which makes it possible to perform meta-analysis. Suppose that the variable `yi` contains the effect sizes from the studies and `sei` the corresponding standard errors. Then, fixed and random effects meta-analyses can be conducted using the formulae `yi | se(sei) ~ 1` and `yi | se(sei) ~ 1 + (1|study)`, respectively, where `study` is a variable uniquely identifying every study. If desired, meta-regression can be performed via `yi | se(sei) ~ 1 + mod1 + mod2 + (1|study)` or `yi | se(sei) ~ 1 + mod1 + mod2 + (1 + mod1 + mod2|study)`, where `mod1` and `mod2` represent moderator variables. By default, the standard errors replace the parameter `sigma`. To model `sigma` in addition to the known standard errors, set argument `sigma` in function `se` to `TRUE`, for instance, `yi | se(sei, sigma = TRUE) ~ 1`.

For all families, weighted regression may be performed using `weights` in the `aterms` part. Internally, this is implemented by multiplying the log-posterior values of each observation by their corresponding weights. Suppose that variable `wei` contains the weights and that `yi` is the response variable. Then, formula `yi | weights(wei) ~ predictors` implements a weighted regression.
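The effect of multiplying log-density contributions by weights can be illustrated in base R. This sketch does not call brms; it only shows, under a normal likelihood with made-up values, that an integer weight on an observation's log-likelihood is equivalent to repeating that observation.

```r
# Made-up data and parameter values for illustration
y  <- c(1.2, 0.4, 2.5)
w  <- c(1, 3, 2)          # integer weights
mu <- 1
sigma <- 1

# Weighted log-likelihood: each contribution is multiplied by its weight
weighted_ll <- sum(w * dnorm(y, mean = mu, sd = sigma, log = TRUE))

# Same quantity obtained by replicating each observation w times
replicated_ll <- sum(dnorm(rep(y, times = w), mean = mu, sd = sigma, log = TRUE))

all.equal(weighted_ll, replicated_ll)
```

Non-integer weights generalize the same idea, scaling how strongly each observation informs the posterior.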

(DEPRECATED) The addition argument `disp` (short for dispersion) serves a similar purpose to `weights`. However, it has a different implementation and is less general as it is only usable for the families `gaussian`, `student`, `lognormal`, `exgaussian`, `asym_laplace`, `Gamma`, `weibull`, and `negbinomial`. For the former five families, the residual standard deviation `sigma` is multiplied by the values given in `disp`, so that higher values lead to lower weights. Contrariwise, for the latter three families, the parameter `shape` is multiplied by the values given in `disp`. As `shape` can be understood as a precision parameter (inverse of the variance), higher values will lead to higher weights in this case. Instead of using addition argument `disp`, you may equivalently use the distributional regression approach by specifying `sigma ~ 1 + offset(log(xdisp))` or `shape ~ 1 + offset(log(xdisp))`, where `xdisp` is the variable being passed to `disp`.

For families `binomial` and `zero_inflated_binomial`, addition should contain a variable indicating the number of trials underlying each observation. In `lme4` syntax, we may write for instance `cbind(success, n - success)`, which is equivalent to `success | trials(n)` in brms syntax. If the number of trials is constant across all observations, say `10`, we may also write `success | trials(10)`.

For all ordinal families, `aterms` may contain a term `cat(number)` to specify the number of categories (e.g., `cat(7)`). If not given, the number of categories is calculated from the data.

With the exception of `categorical` and ordinal families, left, right, and interval censoring can be modeled through `y | cens(censored) ~ predictors`. The censoring variable (named `censored` in this example) should contain the values `'left'`, `'none'`, `'right'`, and `'interval'` (or equivalently `-1`, `0`, `1`, and `2`) to indicate that the corresponding observation is left censored, not censored, right censored, or interval censored. For interval censored data, a second variable (let's call it `y2`) has to be passed to `cens`. In this case, the formula has the structure `y | cens(censored, y2) ~ predictors`. While the lower bounds are given in `y`, the upper bounds are given in `y2` for interval censored data. Intervals are assumed to be open on the left and closed on the right: `(y, y2]`.
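As a sketch of how the censoring variable might be prepared, the following builds a small, entirely hypothetical data frame and the matching formula. The column names `y`, `y2`, and `censored` follow the text above; filling `y2` with `y` for rows that are not interval censored is just a placeholder choice made here.

```r
library(brms)

# Hypothetical data: y holds observed values or lower bounds,
# y2 holds the upper bound for the interval-censored row
dat <- data.frame(
  y        = c(2.1, 5.0, 3.3, 1.0),
  y2       = c(2.1, 5.0, 4.0, 1.0),
  censored = c("none", "right", "interval", "left"),
  stringsAsFactors = FALSE
)

# Equivalent numeric coding of the censoring indicator
codes <- c(left = -1, none = 0, right = 1, interval = 2)
dat$censored_num <- unname(codes[as.character(dat$censored)])

# Formula for a model with (possibly interval) censored responses
f_cens <- bf(y | cens(censored, y2) ~ 1)
```

With this coding, the third row states that the true response lies in the interval `(3.3, 4.0]`.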

With the exception of `categorical` and ordinal families, the response distribution can be truncated using the `trunc` function in the addition part. If the response variable is truncated between, say, 0 and 100, we can specify this via `yi | trunc(lb = 0, ub = 100) ~ predictors`. Instead of numbers, variables in the data set can also be passed, allowing for varying truncation points across observations. Defining only one of the two arguments in `trunc` leads to one-sided truncation.

In Wiener diffusion models (family `wiener`) the addition term `dec` is mandatory to specify the (vector of) binary decisions corresponding to the reaction times. Non-zero values will be treated as a response on the upper boundary of the diffusion process and zeros will be treated as a response on the lower boundary. Alternatively, the variable passed to `dec` might also be a character vector consisting of `'lower'` and `'upper'`.

Multiple addition terms may be specified at the same time using the `+` operator, for instance `formula = yi | se(sei) + cens(censored) ~ 1` for a censored meta-analytic model.

Formula syntax for multivariate and categorical models

For families `gaussian` and `student`, multivariate models may be specified using `cbind` notation. In brms 1.0.0, the multivariate 'trait' syntax was removed from the package as it repeatedly confused users, required much special case coding, and was hard to maintain. The new syntax is described below. Suppose that `y1` and `y2` are response variables and `x` is a predictor. Then `cbind(y1,y2) ~ x` specifies a multivariate model. The effects of all terms specified at the RHS of the formula are assumed to vary across response variables (this was not the case by default in brms < 1.0.0). For instance, two parameters will be estimated for `x`, one for the effect on `y1` and another for the effect on `y2`. This is also true for group-level effects. When writing, for instance, `cbind(y1,y2) ~ x + (1+x|g)`, group-level effects will be estimated separately for each response. To model these effects as correlated across responses, use the ID syntax (see above). For the present example, this would look as follows: `cbind(y1,y2) ~ x + (1+x|2|g)`. Of course, you could also use any value other than `2` as ID. It is not yet possible to model terms as only affecting certain responses (and not others), but this will be implemented in the future.

Categorical models use the same syntax as multivariate models. As in most other implementations of categorical models, values of one category (the first in brms) are fixed to identify the model. Thus, all terms on the RHS of the formula correspond to `K - 1` effects (`K` = number of categories), one for each non-fixed category. Group-level effects may be specified as correlated across categories using the ID syntax.

As of brms 1.0.0, zero-inflated and hurdle models are specified in the same way as their non-inflated counterparts. However, they have additional distributional parameters (named `zi` and `hu` respectively) modeling the zero-inflation / hurdle probability depending on which model you choose. These parameters can also be affected by predictors in the same way as the response variable itself. See the end of the Details section for information on how to accomplish that.

Parameterization of the population-level intercept

The population-level intercept (if incorporated) is estimated separately and not as part of the population-level parameter vector `b`. As a result, priors on the intercept also have to be specified separately. Furthermore, to increase sampling efficiency, the population-level design matrix `X` is centered around its column means `X_means` if the intercept is incorporated. This leads to a temporary bias in the intercept equal to `<X_means, b>`, where `<,>` is the scalar product. The bias is corrected after fitting the model, but be aware that you are effectively defining a prior on the intercept of the centered design matrix, not on the real intercept. For more details on setting priors on population-level intercepts, see `set_prior`.
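The relation between the intercept of the centered design matrix and the real intercept can be checked with an ordinary least-squares fit. The sketch below uses `lm` on simulated data rather than brms, so it only illustrates the algebra of the correction by `<X_means, b>`, not the package's internals.

```r
set.seed(1)

# Simulated data with a predictor that is far from zero on average
x <- rnorm(50, mean = 3)
y <- 1 + 2 * x + rnorm(50)

# Fit on the original and on the centered predictor
fit_raw <- lm(y ~ x)
x_mean <- mean(x)
x_centered <- x - x_mean
fit_centered <- lm(y ~ x_centered)

# The slope is unchanged; the intercepts differ by x_mean * b
b <- coef(fit_centered)[["x_centered"]]
intercept_centered <- coef(fit_centered)[["(Intercept)"]]
intercept_recovered <- intercept_centered - x_mean * b

all.equal(intercept_recovered, coef(fit_raw)[["(Intercept)"]])
```

The intercept of the centered fit is the expected response at the mean of `x`, which is why a prior placed on it is not a prior on the response at `x = 0`.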

This behavior can be avoided by using the reserved (and internally generated) variable `intercept`. Instead of `y ~ x`, you may write `y ~ 0 + intercept + x`. This way, priors can be defined on the real intercept, directly. In addition, the intercept is just treated as an ordinary population-level effect and thus priors defined on `b` will also apply to it. Note that this parameterization may be less efficient than the default parameterization discussed above.
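The two parameterizations might be set up as sketched below. The prior `normal(0, 10)` and the variable names `y` and `x` are arbitrary choices for illustration, not recommendations.

```r
library(brms)

# Default parameterization: the prior refers to the intercept of the
# centered design matrix and is set via class "Intercept"
f_default <- bf(y ~ x)
prior_default <- set_prior("normal(0, 10)", class = "Intercept")

# Explicit parameterization: 'intercept' is an ordinary population-level
# effect, so its prior is set like that of any other coefficient
f_explicit <- bf(y ~ 0 + intercept + x)
prior_explicit <- set_prior("normal(0, 10)", class = "b", coef = "intercept")
```

Each formula and its prior would then be passed together to `brm`.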

Formula syntax for non-linear models

In brms, it is possible to specify non-linear models of arbitrary complexity. The non-linear model can just be specified within the `formula` argument. Suppose that we want to predict the response `y` through the predictor `x`, where `x` is linked to `y` through `y = alpha - beta * lambda^x`, with parameters `alpha`, `beta`, and `lambda`. This is clearly a non-linear model, defined via `formula = y ~ alpha - beta * lambda^x` (addition arguments can be added in the same way as for ordinary formulas). To tell `brms` that this is a non-linear model, we set argument `nl` to `TRUE`. Now we have to specify a model for each of the non-linear parameters. Let's say we just want to estimate those three parameters with no further covariates or random effects. Then we can pass `alpha + beta + lambda ~ 1` or equivalently (and more flexibly) `alpha ~ 1, beta ~ 1, lambda ~ 1` to the `...` argument. This can, of course, be extended. If we have another predictor `z` and observations nested within the grouping factor `g`, we may write for instance `alpha ~ 1, beta ~ 1 + z + (1|g), lambda ~ 1`. The formula syntax described above applies here as well. In this example, we are using `z` and `g` only for the prediction of `beta`, but we might also use them for the other non-linear parameters (provided that the resulting model is still scientifically reasonable).

Non-linear models may not be uniquely identified and / or show bad convergence. For this reason it is mandatory to specify priors on the non-linear parameters. For instructions on how to do that, see `set_prior`. For some examples of non-linear models, see `vignette("brms_nonlinear")`.
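Putting the pieces of this section together, a non-linear formula and its priors could look as follows. The prior distributions are placeholders chosen only to make the sketch complete; sensible values depend on the scale of the data.

```r
library(brms)

# Non-linear model y = alpha - beta * lambda^x, with each non-linear
# parameter estimated by an intercept only
f_nl <- bf(
  y ~ alpha - beta * lambda^x,
  alpha + beta + lambda ~ 1,
  nl = TRUE
)

# One prior per non-linear parameter (hypothetical values)
priors_nl <- c(
  set_prior("normal(0, 5)", nlpar = "alpha"),
  set_prior("normal(0, 5)", nlpar = "beta"),
  set_prior("normal(0.5, 0.5)", nlpar = "lambda")
)
```

Both objects would then be supplied to `brm` via its `formula` and `prior` arguments.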

Formula syntax for predicting distributional parameters

It is also possible to predict parameters of the response distribution such as the residual standard deviation `sigma` in gaussian models or the hurdle probability `hu` in hurdle models. The syntax closely resembles that of a non-linear parameter, for instance `sigma ~ x + s(z) + (1+x|g)`. For some examples of distributional models, see `vignette("brms_distreg")`.

Alternatively, one may fix distributional parameters to certain values. However, this is mainly useful when models become too complicated and otherwise have convergence issues. We thus suggest being careful when making use of this option. The `quantile` parameter of the `asym_laplace` distribution is a good example where it is useful. By fixing `quantile`, one can perform quantile regression for the specified quantile. For instance, `quantile = 0.25` allows predicting the 25%-quantile. Furthermore, the `bias` parameter in drift-diffusion models is assumed to be `0.5` (i.e. no bias) in many applications. To achieve this, simply write `bias = 0.5`. Other possible applications are the Cauchy distribution as a special case of the Student-t distribution with `nu = 1`, or the geometric distribution as a special case of the negative binomial distribution with `shape = 1`. Furthermore, the parameter `disc` ('discrimination') in ordinal models is fixed to `1` by default and not estimated, but may be modeled as any other distributional parameter if desired (see examples). For reasons of identification, `'disc'` can only be positive, which is achieved by applying the log-link.

All distributional parameters currently supported by `brmsformula` have to be positive (a negative standard deviation or precision parameter doesn't make any sense) or are bounded between 0 and 1 (for zero-inflated / hurdle probabilities, quantiles, or the initial bias parameter of drift-diffusion models). However, linear predictors can be positive or negative, and thus the log link (for positive parameters) or logit link (for probability parameters) are used by default to ensure that distributional parameters are within their valid intervals. This implies that, by default, effects for distributional parameters are estimated on the log / logit scale and one has to apply the inverse link function to get to the effects on the original scale. Alternatively, it is possible to use the identity link to predict parameters on their original scale, directly. However, this is much more likely to lead to problems in the model fitting.

See also `brmsfamily` for an overview of valid link functions.
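As a small worked illustration of applying the inverse link, the base R snippet below back-transforms made-up coefficients from the log and logit scales. The coefficient values are hypothetical and not taken from any fitted model.

```r
# Hypothetical coefficients for sigma on the log scale
b_sigma_intercept <- 0.2
b_sigma_x <- 0.5

# Inverse of the log link is exp(): sigma at x = 0 and x = 1
sigma_x0 <- exp(b_sigma_intercept)
sigma_x1 <- exp(b_sigma_intercept + b_sigma_x)

# On the original scale the effect of x is multiplicative
sigma_ratio <- sigma_x1 / sigma_x0

# Hypothetical intercept for zi on the logit scale
b_zi_intercept <- -1

# Inverse of the logit link is plogis(): a probability in (0, 1)
zi <- plogis(b_zi_intercept)
```

Here `sigma_ratio` equals `exp(b_sigma_x)`, so a one-unit increase in `x` multiplies `sigma` by that factor rather than adding to it.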

Formula syntax for mixture models

The specification of mixture models closely resembles that of non-mixture models. If not specified otherwise (see below), all mean parameters of the mixture components are predicted using the right-hand side of `formula`. All types of predictor terms allowed in non-mixture models are allowed in mixture models as well.

Distributional parameters of mixture distributions have the same name as those of the corresponding ordinary distributions, but with a number at the end to indicate the mixture component. For instance, if you use family `mixture(gaussian, gaussian)`, the distributional parameters are `sigma1` and `sigma2`. Distributional parameters of the same class can be fixed to the same value. For the above example, we could write `sigma2 = "sigma1"` to make sure that both components have the same residual standard deviation, which is in turn estimated from the data.

In addition, there are two types of special distributional parameters. The first are named `mu<ID>` and allow for modeling different predictors for the mean parameters of different mixture components. For instance, if you want to predict the mean of the first component using predictor `x` and the mean of the second component using predictor `z`, you can write `mu1 ~ x` as well as `mu2 ~ z`. The second are named `theta<ID>` and constitute the mixing proportions. If the mixing proportions are fixed to certain values, they are internally normalized to form a probability vector. If one seeks to predict the mixing proportions, all but one of them have to be predicted, while the remaining one is used as the reference category to identify the model. The `softmax` function is applied on the linear predictor terms to form a probability vector.

For more information on mixture models, see the documentation of `mixture`.
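The two ways of obtaining mixing proportions can be sketched numerically in base R. The `softmax` helper below is written here for illustration and is not the function used internally by brms; setting the linear predictor of the reference component to zero is an assumption, as are all numeric values.

```r
# Illustrative softmax; subtracting the maximum improves numerical stability
softmax <- function(eta) {
  z <- exp(eta - max(eta))
  z / sum(z)
}

# Fixed mixing proportions are normalized to sum to one
theta_fixed_raw <- c(2, 1, 1)
theta_fixed <- theta_fixed_raw / sum(theta_fixed_raw)

# Predicted mixing proportions: linear predictors for components 2 and 3,
# with component 1 as the reference (linear predictor assumed to be 0)
eta <- c(0, 0.4, -1.2)
theta_pred <- softmax(eta)
```

In both cases the result is a valid probability vector, so each entry can be read as the share of the corresponding mixture component.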

See Also

`brmsformula-helpers`

Examples

```r
# NOT RUN {
# multilevel model with smoothing terms
brmsformula(y ~ x1*x2 + s(z) + (1+x1|1) + (1|g2))

brmsformula(y ~ x1*x2 + s(z) + (1+x1|1) + (1|g2),
            sigma ~ x1 + (1|g2))

# use the shorter alias 'bf'
(formula1 <- brmsformula(y ~ x + (x|g)))
(formula2 <- bf(y ~ x + (x|g)))
# will be TRUE
identical(formula1, formula2)

# incorporate censoring
bf(y | cens(censor_variable) ~ predictors)

# define a simple non-linear model
bf(y ~ a1 - a2^x, a1 + a2 ~ 1, nl = TRUE)

# predict a1 and a2 differently
bf(y ~ a1 - a2^x, a1 ~ 1, a2 ~ x + (x|g), nl = TRUE)

# correlated group-level effects across parameters
bf(y ~ a1 - a2^x, a1 ~ 1 + (1|2|g), a2 ~ x + (x|2|g), nl = TRUE)

# define a multivariate model
bf(cbind(y1, y2) ~ x * z + (1|g))

# define a zero-inflated model
# also predicting the zero-inflation part
bf(y ~ x * z + (1+x|ID1|g), zi ~ x + (1|ID1|g))

# specify a predictor as monotonic
bf(y ~ mo(x) + more_predictors)

# for ordinal models only
# specify a predictor as category specific
bf(y ~ cs(x) + more_predictors)
# add a category specific group-level intercept
bf(y ~ cs(x) + (cs(1)|g))
# specify parameter 'disc'
bf(y ~ person + item, disc ~ item)

# specify variables containing measurement error
bf(y ~ me(x, sdx))

# specify predictors on all parameters of the wiener diffusion model
# the main formula models the drift rate 'delta'
bf(rt | dec(decision) ~ x, bs ~ x, ndt ~ x, bias ~ x)

# fix the bias parameter to 0.5
bf(rt | dec(decision) ~ x, bias = 0.5)

# specify different predictors for different mixture components
mix <- mixture(gaussian, gaussian)
bf(y ~ 1, mu1 ~ x, mu2 ~ z, family = mix)

# fix both residual standard deviations to the same value
bf(y ~ x, sigma2 = "sigma1", family = mix)

# use the '+' operator to specify models
bf(y ~ 1) +
  nlf(sigma ~ a * exp(b * x), a ~ x) +
  lf(b ~ z + (1|g), dpar = "sigma") +
  gaussian()

# }
```
