tidy_stan: Tidy summary output for stan models

Description

Returns a tidy summary output for stan models.

Usage

tidy_stan(
  x,
  prob = 0.89,
  typical = "median",
  trans = NULL,
  effects = c("all", "fixed", "random"),
  component = c("all", "conditional", "zero_inflated", "zi"),
  digits = 2
)

Arguments

x

A stanreg, stanfit or brmsfit object.

prob

Vector of scalars between 0 and 1, indicating the mass within the credible interval that is to be estimated.

typical

The typical value that will represent the Bayesian point estimate. By default, the posterior median is returned. See typical_value for possible values for this argument.

trans

Name of a function or character vector naming a function, used to apply transformations on the estimates and uncertainty intervals. The values for standard errors are not transformed! If trans is not NULL, credible intervals instead of HDI are computed, due to the possible asymmetry of the HDI.

effects

Should results for fixed effects, random effects or both be returned? Only applies to mixed models. May be abbreviated.

component

Should results for all parameters, parameters for the conditional model or the zero-inflated part of the model be returned? May be abbreviated. Only applies to brms-models.

digits

Number of digits to which numerical values in the output are rounded.

Value

A data frame, summarizing x, with consistent column names. To distinguish multiple HDI values, column names for the HDI get a suffix when prob has more than one element.

Details

The returned data frame has an additional class attribute, tidy_stan, so that the result is passed to its own print() method. The print() method creates a cleaner output, especially for multilevel, zero-inflated or multivariate response models, where, for instance, the conditional part of a model is printed separately from the zero-inflated part, or random and fixed effects are printed separately.

The returned data frame gives information on:

The Bayesian point estimate (column estimate, which is by default the posterior median; other statistics are also possible, see argument typical).
The standard error (which is actually the median absolute deviation).
The HDI. Computation for HDI is based on the code from Kruschke 2015, pp. 727f.
The Probability of Direction (pd), which is an index for "effect significance" (see Makowski et al. 2019). A value of 95% or higher indicates a "significant" (i.e. statistically clear) effect.
The effective number of samples, ESS.
The Rhat statistic. An Rhat clearly above 1 usually indicates that the chains have not yet converged, so the drawn samples might not be trustworthy. Drawing more iterations may solve this issue.
The Monte Carlo standard error (see mcse). It is defined as the standard deviation of the chains divided by the square root of their effective sample size and "provides a quantitative suggestion of how big the estimation noise is" (Kruschke 2015, p. 187).
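
As a rough illustration of the quantities listed above, the following base-R sketch computes them from a plain vector of draws. It is not the internal code of tidy_stan(): the simulated draws, the helper hdi_sketch() and the assumption that the effective sample size equals the number of draws are all hypothetical and only serve to make the definitions concrete.

```r
# Sketch only: base-R approximations of the summary columns,
# not the implementation used by tidy_stan().
set.seed(123)
draws <- rnorm(4000, mean = 0.5, sd = 0.2)  # stand-in for posterior draws

# Point estimate: posterior median (the default for `typical`)
estimate <- median(draws)

# "Standard error": median absolute deviation of the draws
# (note that stats::mad() rescales by 1.4826 by default)
std_error <- mad(draws)

# HDI: narrowest interval that contains `prob` of the sorted draws,
# following the idea of the HDI function in Kruschke (2015); needs prob < 1
hdi_sketch <- function(x, prob = 0.89) {
  x <- sort(x)
  n <- length(x)
  window <- ceiling(prob * n)      # number of draws spanned by the interval
  starts <- seq_len(n - window)    # candidate positions of the lower bound
  widths <- x[starts + window] - x[starts]
  lower <- which.min(widths)       # position of the narrowest interval
  c(low = x[lower], high = x[lower + window])
}
hdi_89 <- hdi_sketch(draws, prob = 0.89)

# Probability of Direction: share of draws with the same sign as the majority
pd <- max(mean(draws > 0), mean(draws < 0))

# Monte Carlo standard error: SD / sqrt(ESS), as defined in Kruschke (2015).
# Assumption: the simulated draws are independent, so ESS is taken as n.
ess <- length(draws)
mcse <- sd(draws) / sqrt(ess)
```

With real MCMC output the draws are autocorrelated, so the effective sample size is usually smaller than the number of draws and has to be estimated from the chains rather than assumed.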

References

Kruschke JK. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. 2nd edition. Academic Press, 2015.

Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. Bayesian data analysis. 3rd ed. Boca Raton: Chapman and Hall/CRC, 2013.

Gelman A, Rubin DB. Inference from iterative simulation using multiple sequences. Statistical Science 1992;7: 457-511.

Makowski D, Ben-Shachar MS, Lüdecke D. bayestestR: Describing Effects and their Uncertainty, Existence and Significance within the Bayesian Framework. Journal of Open Source Software 2019;4:1541. 10.21105/joss.01541

McElreath R. Statistical Rethinking. A Bayesian Course with Examples in R and Stan. Chapman and Hall, 2015.

Examples

# NOT RUN {
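if (require("rstanarm")) {
  # fit a simple Bayesian linear model; a single chain keeps the example fast
  fit <- stan_glm(mpg ~ wt + am, data = mtcars, chains = 1)
  # summary with the default 89% interval
  tidy_stan(fit)
  # two intervals: the HDI columns get a suffix for each value of prob
  tidy_stan(fit, prob = c(.89, .5))
}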
# }