tidy_stan: Tidy summary output for stan models

Description

Returns a tidy summary output for stan models.

Usage

tidy_stan(x, prob = 0.89, typical = "median", trans = NULL,
  type = c("fixed", "random", "all"), digits = 2)

Arguments

x

A stanreg, stanfit or brmsfit object.

prob

Vector of scalars between 0 and 1, indicating the mass within the credible interval that is to be estimated.

typical

The typical value that will represent the Bayesian point estimate. By default, the posterior median is returned. See typical_value for possible values for this argument.

trans

Name of a function or character vector naming a function, used to apply transformations on the estimates and uncertainty intervals. The values for standard errors are not transformed! If trans is not NULL, credible intervals instead of HDI are computed, due to the possible asymmetry of the HDI.
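The behaviour of interval limits under a transformation can be illustrated without fitting a model. The following is a minimal sketch in base R, not the package's internal code: the draws are simulated and stand in for posterior samples of a coefficient on the log scale, and all variable names are hypothetical. It shows that an equal-tailed (quantile-based) credible interval is preserved under a monotonic transformation such as exp(), so transforming the limits gives the same interval as summarising the transformed draws.

```r
# Simulated draws standing in for posterior samples of a coefficient
# on the log scale (hypothetical data, not output of tidy_stan()).
set.seed(123)
draws <- rnorm(4000, mean = 0.5, sd = 0.3)

prob <- 0.89
tails <- c((1 - prob) / 2, 1 - (1 - prob) / 2)

# Equal-tailed credible interval on the original scale.
# type = 1 picks an observed draw (no interpolation), so the
# comparison below is exact rather than approximate.
ci_raw <- quantile(draws, probs = tails, names = FALSE, type = 1)

# The same interval computed from the transformed draws.
ci_exp <- quantile(exp(draws), probs = tails, names = FALSE, type = 1)

# Transforming the limits and summarising the transformed draws agree.
same_interval <- isTRUE(all.equal(exp(ci_raw), ci_exp))
print(same_interval)
```

A shortest interval (HDI) computed on the original scale does not have this property in general, because the shortest interval depends on the scale on which the width is measured.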

type

For mixed effects models, specify the type of effects that should be returned. type = "fixed" returns fixed effects only, type = "random" the random effects and type = "all" returns both fixed and random effects.

digits

Number of digits to which numerical values in the output are rounded.

Value

A tidy data frame, summarizing x, with consistent column names. To distinguish multiple HDI values, column names for the HDI get a suffix when prob has more than one element.

Details

The returned data frame has an additional class attribute, tidy_stan, to pass the result to its own print() method. The print() method creates a cleaner output, especially for multilevel, zero-inflated or multivariate response models, where, for instance, the conditional part of a model is printed separately from the zero-inflated part, or random and fixed effects are printed separately.

The returned data frame gives information on:

The Bayesian point estimate (column estimate, which is by default the posterior median; other statistics are also possible, see argument typical).
The standard error (which is actually the median absolute deviation).
The HDI. Computation for HDI is based on the code from Kruschke 2015, pp. 727f.
The ratio of the effective number of samples, neff_ratio (i.e. the effective number of samples divided by the total number of samples). This ratio ranges from 0 to 1 and should be close to 1. The closer it comes to zero, the less efficient the chains may be, although the results may still be okay.
The Rhat statistic. When Rhat is above 1, it usually indicates that the chain has not yet converged, so the drawn samples might not be trustworthy. Drawing more iterations may solve this issue.
The Monte Carlo standard error (see mcse). It is defined as the standard deviation of the chains divided by the square root of their effective sample size and “provides a quantitative suggestion of how big the estimation noise is” (Kruschke 2015, p. 187).
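Several of the quantities listed above can be sketched in base R on a plain vector of draws. This is an illustration under stated assumptions, not the package's implementation: the draws are simulated, hdi_sketch() is a hypothetical helper that searches for the shortest interval containing the requested mass (similar in spirit to the approach referenced from Kruschke 2015), and n_eff is a made-up value standing in for an effective sample size that would normally come from the fitted model.

```r
# Simulated draws standing in for one parameter's posterior samples
# (hypothetical data, not output of tidy_stan()).
set.seed(42)
draws <- rnorm(4000, mean = 1, sd = 2)

# Point estimate (posterior median) and median absolute deviation.
# stats::mad() scales by 1.4826 by default; whether the package uses
# the same scaling is not stated in this page.
estimate <- median(draws)
std_error <- mad(draws)

# Hypothetical helper: shortest interval containing `prob` of the draws.
hdi_sketch <- function(x, prob = 0.89) {
  stopifnot(prob > 0, prob < 1)
  x <- sort(x)
  n <- length(x)
  n_in <- ceiling(prob * n)      # index span covered by each candidate
  n_cand <- n - n_in             # number of candidate intervals
  stopifnot(n_cand >= 1)
  idx <- seq_len(n_cand)
  widths <- x[idx + n_in] - x[idx]
  i <- which.min(widths)         # narrowest candidate
  c(lower = x[i], upper = x[i + n_in])
}
hdi <- hdi_sketch(draws, prob = 0.89)

# Ratio of effective to total number of samples, using an assumed
# effective sample size (hypothetical value).
n_eff <- 3200
neff_ratio <- n_eff / length(draws)

print(round(c(estimate = estimate, std.error = std_error, hdi), 2))
print(neff_ratio)
```

Passing a vector such as c(0.89, 0.5) to a helper like this one value at a time yields one interval per probability, which mirrors how prob with more than one element leads to several interval columns in the returned data frame.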

References

Kruschke JK. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. 2nd ed. Academic Press, 2015.

Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. Bayesian Data Analysis. 3rd ed. Boca Raton: Chapman and Hall/CRC, 2013.

Gelman A, Rubin DB. Inference from iterative simulation using multiple sequences. Statistical Science 1992;7: 457-511.

McElreath R. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. Chapman and Hall, 2015.

Examples

# NOT RUN {
if (require("rstanarm")) {
  fit <- stan_glm(mpg ~ wt + am, data = mtcars, chains = 1)
  tidy_stan(fit)
  tidy_stan(fit, prob = c(.89, .5))
}
# }