
bayesics (version 2.0.2)

glm_b: Bayesian Generalized Linear Models

Description

glm_b is used to fit Bayesian generalized linear models. It can be used to carry out regression for gaussian, binomial, poisson, and negative binomial data. Note that if the family is gaussian, this is just a wrapper for lm_b.

Usage

glm_b(
  formula,
  data,
  family,
  trials,
  prior = c("zellner", "normal", "improper")[1],
  zellner_g,
  prior_beta_mean,
  prior_beta_precision,
  prior_phi_mean = 1,
  ROPE,
  CI_level = 0.95,
  vb_maximum_iterations = 1000,
  algorithm = "VB",
  proposal_df = 5,
  seed = 1,
  mc_error = 0.01,
  save_memory = FALSE
)

Value

glm_b() returns an object of class "glm_b", which behaves as a list with the following elements:

  • summary - tibble giving results for regression coefficients

  • posterior_draws

  • ROPE

  • hyperparameters - list giving the user input or default hyperparameters used

  • fitted - posterior mean of the individuals' means

  • residuals - posterior mean of the residuals

  • If algorithm = "IS", the following:

    • proposal_draws - draws from the importance sampling proposal distribution (i.e., a multivariate t distribution centered at the posterior mode, with precision equal to the negative Hessian and degrees of freedom set to the user input proposal_df).

    • importance_sampling_weights - importance sampling weights that match the rows of the returned proposal_draws.

    • effective_sample_size

    • mc_error

  • other inputs into glm_b

Importance sampling:

If algorithm = "IS", glm_b will perform importance sampling. The proposal is a multivariate t distribution, centered at the posterior mode, with the negative Hessian as its precision matrix. Do NOT treat the proposal_draws as posterior draws; posterior quantities must be computed using the importance_sampling_weights.
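
As a sketch of the mechanics (not glm_b's internals; the target and proposal below are toy stand-ins), the importance sampling weights and the effective sample size are typically computed as:

```r
# Toy illustration of importance sampling with a t proposal.
# The target here is a standard normal; glm_b's actual target is the posterior.
set.seed(1)
draws <- rt(5000, df = 5)                # proposal draws (cf. proposal_df = 5)
log_w <- dnorm(draws, log = TRUE) -      # log target density ...
  dt(draws, df = 5, log = TRUE)          # ... minus log proposal density
w <- exp(log_w - max(log_w))             # subtract max before exponentiating for stability
w <- w / sum(w)                          # normalized weights sum to 1
ess <- 1 / sum(w^2)                      # effective sample size
target_mean <- sum(w * draws)            # self-normalized IS estimate of the target mean
```

Averaging the raw draws (i.e., treating them as posterior draws) would instead estimate the proposal mean, which is why the weights must always be applied.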

Priors:

If the prior is set to either "zellner" or "normal", a normal distribution will be used as the prior of \(\beta\), specifically $$\beta \sim N(\mu, V^{-1})$$

where \(\mu\) is the prior_beta_mean and \(V\) is the prior_beta_precision (not covariance) matrix.

  • zellner: glm_b sets \(\mu=0\) and \(V = \frac{1}{g} X^{\top} X\).

  • normal: If prior_beta_mean is missing, glm_b sets \(\mu=0\); if prior_beta_precision is missing, \(V\) will be a diagonal matrix. The first diagonal element, corresponding to the intercept, will be \((2.5\,\tilde{s}_y)^{-2}\), where \(\tilde{s}_y = \max(s_y, 1)\) and \(s_y\) is the standard deviation of \(y\). The remaining diagonal elements will equal \((2.5 s_y/s_x)^{-2}\), where \(s_x\) is the standard deviation of the corresponding covariate. This equates to being roughly 95% certain a priori that a change in \(x\) of one standard deviation (\(s_x\)) would not change the linear predictor by more than 5 standard deviations (\(5s_y\)). This imposes weak regularization that adapts to the scale of the data.
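
For example, the default diagonal precision under prior = "normal" could be assembled as follows (an illustrative sketch of the formulas above using mtcars, not code taken from glm_b):

```r
# Default diagonal prior precision for an intercept plus two covariates.
y <- mtcars$mpg
X <- mtcars[, c("wt", "hp")]
s_y <- sd(y)
s_y_tilde <- max(s_y, 1)                           # guard against a tiny outcome scale
prec_intercept <- (2.5 * s_y_tilde)^(-2)           # intercept precision
prec_slopes <- (2.5 * s_y / apply(X, 2, sd))^(-2)  # one element per covariate
V <- diag(c(prec_intercept, prec_slopes))          # p x p diagonal precision matrix
```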

ROPE:

If missing, the ROPE bounds will be given under the principle of "half of a small effect size."

  • Gaussian. Using Cohen's d of 0.2 as a small effect size, the ROPE is built under the principle that moving across the full range of \(x\) (i.e., \(\pm 2 s_x\)) will not move the mean of \(y\) outside the interval from \(\bar{y} - 0.1s_y\) to \(\bar{y} + 0.1s_y\). The result is a ROPE equal to \(|\beta_j| < 0.05s_y/s_j\). If the covariate is binary, this is simply \(|\beta_j| < 0.2s_y\).

  • Poisson. FDA guidance suggests a small effect is a rate ratio of less than 1.25. We use half this effect, 1.125, and build the ROPE under the principle that moving across the full range of \(x\) (\(\pm 2s_x\)) will not change the rate ratio by more than this amount. Thus the ROPE for the regression coefficient equals \(|\beta| < \frac{\log(1.125)}{4s_x}\). For binary covariates, this is simply \(|\beta| < \log(1.125)\).
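
The default bounds above reduce to one-line computations; for instance, with hypothetical standard deviations \(s_y = 2\) and \(s_x = 1.5\):

```r
# Default ROPE half-widths implied by the formulas above (illustrative values).
s_y <- 2; s_x <- 1.5                            # hypothetical scales
rope_gaussian        <- 0.05 * s_y / s_x        # continuous covariate, gaussian
rope_gaussian_binary <- 0.2 * s_y               # binary covariate, gaussian
rope_poisson         <- log(1.125) / (4 * s_x)  # continuous covariate, poisson
rope_poisson_binary  <- log(1.125)              # binary covariate, poisson
```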

Arguments

formula

A formula specifying the model.

data

A data frame in which the variables specified in the formula will be found. If missing, the variables are searched for in the standard way. However, it is strongly recommended that you use this argument so that other generics for bayesics objects work correctly.

family

A description of the error distribution and link function to be used in the model. See ?glm for more information. Currently implemented families are binomial(), poisson(), negbinom(), and gaussian() (this last acts as a wrapper for lm_b). If family is missing, glm_b will try to infer the data type; negative binomial will be used for count data.

trials

Either character naming the variable in data that corresponds to the number of trials in the binomial observations, or else an integer vector giving the number of trials for each observation.

prior

character. One of "zellner", "normal", or "improper", giving the type of prior used on the regression coefficients.

zellner_g

numeric. Positive number giving the value of "g" in Zellner's g prior. Ignored unless prior = "zellner". Default is the number of observations.

prior_beta_mean

numeric vector of the same length as the number of regression coefficients (denoted p). Unless otherwise specified, automatically set to rep(0, p). Ignored unless prior = "normal".

prior_beta_precision

p × p matrix giving the a priori precision matrix, to be scaled by the residual precision. Ignored unless prior = "normal".

prior_phi_mean

For negative binomial distributed outcomes, an exponential distribution is used as the prior of the dispersion parameter \(\phi\), parameterized such that \(\text{Var}(y) = \mu + \frac{\mu^2}{\phi}\). The prior on \(\phi\) is thus \(\lambda e^{-\lambda \phi}\), where \(\lambda\) equals \(1/\)prior_phi_mean.
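
This parameterization can be checked empirically (a quick sanity check, not glm_b code): R's rnbinom with size = \(\phi\) and mu = \(\mu\) uses exactly this variance form.

```r
# Check that Var(y) = mu + mu^2/phi under R's (size, mu) parameterization.
set.seed(1)
mu <- 3; phi <- 0.7
y <- rnbinom(1e5, size = phi, mu = mu)
theoretical_var <- mu + mu^2 / phi   # = 3 + 9/0.7
# var(y) should be close to theoretical_var for a large sample
```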

ROPE

vector of positive values giving ROPE boundaries for each regression coefficient. The boundary for the intercept may optionally be omitted. If missing, defaults follow those suggested by Kruschke (2018).

CI_level

numeric. Credible interval level.

vb_maximum_iterations

if algorithm = "VB", the maximum number of iterations used in the fixed-form VB algorithm.

algorithm

Either "VB" (default) for fixed-form variational Bayes, "IS" for importance sampling, or "LSA" for large sample approximation.

proposal_df

degrees of freedom used in the multivariate t proposal distribution if algorithm = "IS".

seed

integer. Random seed; always set your seed! Not used for algorithm = "LSA".

mc_error

If importance sampling is used, the number of posterior draws will be chosen so that, with 99% probability, the bounds of the credible intervals are within \(\pm\) mc_error of their exact values.

save_memory

logical. If TRUE, a more memory-efficient approach will be taken at the expense of computational time (importance sampling only). If memory is an issue, it is probably because the sample size is large, in which case the large sample approximation without IS should work well.

References

Kruschke JK. Rejecting or Accepting Parameter Values in Bayesian Estimation. Advances in Methods and Practices in Psychological Science. 2018;1(2):270-280. doi:10.1177/2515245918771304

Salimans T, Knowles DA. Fixed-Form Variational Posterior Approximation through Stochastic Linear Regression. Bayesian Analysis. 2013;8(4):837-882. doi:10.1214/13-BA858

Examples

# \donttest{
# Generate some negative-binomial data
set.seed(2025)
N = 500
test_data =
  data.frame(x1 = rnorm(N),
             x2 = rnorm(N),
             x3 = letters[1:5],
             time = rexp(N))
test_data$outcome =
  rnbinom(N,
          mu = exp(-2 + test_data$x1 + 2 * (test_data$x3 %in% c("d","e"))) * test_data$time,
          size = 0.7)

# Fit using variational Bayes (default)
fit_vb1 <-
  glm_b(outcome ~ x1 + x2 + x3 + offset(log(time)),
        data = test_data,
        family = negbinom(),
        seed = 2025)
fit_vb1
summary(fit_vb1,
        CI_level = 0.9)
plot(fit_vb1)
coef(fit_vb1)
credint(fit_vb1,
        CI_level = 0.99)
bayes_factors(fit_vb1,
              by = "v")
preds =
  predict(fit_vb1)

# Try different priors
fit_vb2 <-
  glm_b(outcome ~ x1 + x2 + x3 + offset(log(time)),
        data = test_data,
        family = negbinom(),
        seed = 2025,
        prior = "normal")
fit_vb2
fit_vb3 <-
  glm_b(outcome ~ x1 + x2 + x3 + offset(log(time)),
        data = test_data,
        family = negbinom(),
        seed = 2025,
        prior = "improper")
fit_vb3

# Use importance sampling instead of VB
fit_is <-
  glm_b(outcome ~ x1 + x2 + x3 + offset(log(time)),
        data = test_data,
        family = negbinom(),
        algorithm = "IS",
        seed = 2025)
summary(fit_is)

# Use large sample approximation instead of VB
fit_lsa <-
  glm_b(outcome ~ x1 + x2 + x3 + offset(log(time)),
        data = test_data,
        family = negbinom(),
        algorithm = "LSA",
        seed = 2025)
summary(fit_lsa)
# }
