aov_b: Analysis of Variance using Bayesian methods

Description

Analysis of Variance using Bayesian methods

Usage

aov_b(
  formula,
  data,
  heteroscedastic = TRUE,
  prior_mean_mu,
  prior_mean_nu = 0.001,
  prior_var_shape = 0.001,
  prior_var_rate = 0.001,
  CI_level = 0.95,
  ROPE = 0.1,
  contrasts,
  improper = FALSE,
  seed = 1,
  mc_error = 0.002,
  compute_bayes_factor = TRUE
)

Value

Object of class "aov_b" with the following elements:

summary - tibble giving the summary of the model parameters
BF_for_different_vs_same_means - Bayes factor in favor of the full model (each group has their own mean) vs. the null model (all groups have the same mean).
pairwise summary - tibble giving the summary comparing all factor level means
contrasts (if provided) - list with named elements L (the contrasts provided by the user) and summary.
posterior_draws - mcmc object (see coda package) giving the posterior draws
posterior_parameters -
- mu_g - the post. means of the group means
- nu_g - the post. scalars of the precision
- a_g - (twice) the post. shape of the inv. gamma for the group variances
- b_g - (twice) the post. rate of the inv. gamma for the group variances.
hyperparameters -
- mu - the prior mean of the group means
- nu - the prior scalar of the precision
- a - (twice) the prior shape of the inv. gamma for the group variances
- b - (twice) the prior rate of the inv. gamma for the group variances.
fitted - Posterior mean of $\mu_g := \mathbb{E}(y_{gi})$
residuals - Posterior mean of the residuals
standardized_residuals - Estimated residuals divided by the group standard deviation
mc_error - absolute errors used to determine number of posterior draws for accurate interval estimation

Arguments

formula: A formula specifying the model.
data: A data frame in which the variables specified in the formula will be found. If missing, the variables are searched for in the standard way.
heteroscedastic: logical. Set to FALSE to assume all groups have equal variance.
prior_mean_mu: numeric. Hyperparameter for the a priori mean of the group means.
prior_mean_nu: numeric. Hyperparameter which scales the precision of the group means.
prior_var_shape: numeric. Twice the shape parameter for the inverse gamma prior on the residual variance(s). I.e., $\sigma^2\sim IG$(prior_var_shape/2,prior_var_rate/2).
prior_var_rate: numeric. Twice the rate parameter for the inverse gamma prior on the residual variance(s). I.e., $\sigma^2\sim IG$(prior_var_shape/2,prior_var_rate/2).
CI_level: numeric. Credible interval level.
ROPE: numeric. Used to compute posterior probability that Cohen's D +/- ROPE
contrasts: numeric/matrix. Either vector of length equal to the number of levels in the grouping variable, or else a matrix where each row is a separate contrast, and the number of columns match the number of levels in the grouping variable.
improper: logical. Should we use an improper prior that is proportional to the inverse of the variance?
seed: integer. Always set your seed!!!
mc_error: The number of posterior draws will ensure that with 99% probability the bounds of the credible intervals will be within $\pm$ mc_error$\times 4s_y$, that is, within 100mc_error% of the trimmed range of y.
compute_bayes_factor: logical. Computing the BF can be done analytically, but it requires an nxn matrix. If this will require more than 1GB of memory, compute_bayes_factor will automatically be set to FALSE. This setting can be overridden by setting compute_bayes_factor="force".

Details

MODEL: The likelihood model is given by $$ y_{gi} \overset{iid}{\sim} N(\mu_g,\sigma^2_g), $$ (although if heteroscedastic is set to FALSE, $\sigma^2_g=\sigma^2_h$ $\forall g,h$).

The prior is given by $$ \mu_g|\sigma^2_g \overset{iid}{\sim} N\left(\mu,\frac{\sigma^2_g}{\nu}\right), \\ \sigma^2_g \overset{iid}{\sim} \Gamma^{-1}(a/2,b/2), $$ where $mu$ is set by prior_mean_mu, $nu$ is set by prior_mean_nu, $a$ is set by prior_var_shape, and $b$ is set by prior_var_rate.

The posterior is $$ \mu_g|y,\sigma^2_g \overset{iid}{\sim} N\left(\hat\mu_g,\frac{\sigma^2_g}{\nu_g}\right), \\ \sigma^2_g|y \overset{iid}{\sim} \Gamma^{-1}(a_g/2,b_g/2), $$ where $\hat\mu_g$, $\nu_g$, $a_g$, and $b_g$ are all returned by aov_b in the named element posterior_parameters.

ROPE:

If missing, the ROPE bounds will be given under the principle of "half of a small effect size." Using Cohen's D of 0.2 as a small effect size, the ROPE is defined in terms of $-0.1 <$ Cohen's D $ < 0.1$.

References

Charles R. Doss, James M. Flegal, Galin L. Jones, Ronald C. Neath "Markov chain Monte Carlo estimation of quantiles," Electronic Journal of Statistics, Electron. J. Statist. 8(2), 2448-2478, (2014)

Examples

Run this code

# \donttest{
# Create data
set.seed(2025)
N = 500
test_data = 
  data.frame(x1 = rep(letters[1:5],N/5))
test_data$outcome = 
  rnorm(N,-1 + 2 * (test_data$x1 %in% c("d","e")) )

# Fit 1-way ANOVA model
fit1 <-
  aov_b(outcome ~ x1,
        test_data,
        prior_mean_mu = 2,
        prior_mean_nu = 0.5,
        prior_var_shape = 0.01,
        prior_var_rate = 0.01)
fit1
summary(fit1)
plot(fit1)
coef(fit1)
credint(fit1)
credint(fit1,
        CI_level = 0.99)
vcov(fit1)
fit1_predictions <- 
  predict(fit1,
          CI_level = 0.99,
          PI_level = 0.9)
AIC(fit1)
BIC(fit1)
DIC(fit1)
WAIC(fit1)

# Implement contrasts
## One contrast
fit2 <-
  aov_b(outcome ~ x1,
        test_data,
        mc_error = 0.01,
        contrasts = c(-1/3,-1/3,-1/3,1/2,1/2))
fit2$contrasts
summary(fit2)
## Multiple contrasts
fit3 <-
  aov_b(outcome ~ x1,
        test_data,
        mc_error = 0.01,
        contrasts = rbind(c(-1/3,-1/3,-1/3,1/2,1/2),
                          c(-1/3,-1/3,-1/3,1,0)))
fit3$contrasts
summary(fit3)
# }

Run the code above in your browser using DataLab