
rstanarm (version 2.9.0-3)

pp_check: Graphical posterior predictive checks

Description

Various plots comparing the observed outcome variable $y$ to simulated datasets $y^{rep}$ from the posterior predictive distribution.

Usage

pp_check(object, check = "distributions", nreps = NULL, seed = NULL,
  overlay = TRUE, test = "mean", ...)

Arguments

object
A fitted model object returned by one of the rstanarm modeling functions. See stanreg-objects.
check
The type of plot (possibly abbreviated) to show. One of "distributions", "residuals", "scatter", "test". See Details for descriptions.
nreps
The number of $y^{rep}$ datasets to generate from the posterior predictive distribution (posterior_predict) and show in the plots. The default is nreps=3 for check="residuals".
seed
An optional seed to pass to posterior_predict.
overlay
For check="distributions" only, should distributions be plotted as density estimates overlaid in a single plot (TRUE, the default) or as separate histograms (FALSE)?
test
For check="test" only, a character vector (of length 1 or 2) naming a single function or a pair of functions. The function(s) should take a vector input and return a scalar test statistic. See Details and Examples.
...
Optional arguments to geoms to control features of the plots (e.g. binwidth if the plot is a histogram).

Value

  • A ggplot object that can be further customized using the ggplot2 package.

Details

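Descriptions of the plots corresponding to the different values of check:

  • distributions: The distribution of y compared with the distributions of the nreps simulated $y^{rep}$ datasets, drawn as overlaid density estimates or as separate histograms depending on overlay.
  • residuals: Residuals for each of the nreps simulated datasets.
  • scatter: y plotted against the average $y^{rep}$ when nreps is NULL, or against each of the nreps simulated datasets when nreps is given.
  • test: The distribution of a test statistic computed on the simulated datasets, shown as a histogram for a single function or as a scatterplot for a pair of functions.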
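The quantities behind check = "test" and check = "scatter" can be illustrated in base R without fitting a model. This is only a sketch of the idea, not the rstanarm implementation: y and yrep below are simulated stand-ins, yrep is assumed to hold one replicated dataset per row, and test_stat is a hypothetical name for whatever function is passed via test.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-ins: y is the observed outcome and yrep holds one
# simulated dataset per row (assumed layout: draws in rows, observations
# in columns).
n <- 50
y <- rnorm(n, mean = 2, sd = 1)
yrep <- matrix(rnorm(200 * n, mean = 2, sd = 1), nrow = 200, ncol = n)

# Idea behind check = "test": apply a function that maps a vector to a
# scalar, once to y and once to every replicated dataset.
test_stat <- function(x) mean(x)
T_y <- test_stat(y)
T_yrep <- apply(yrep, 1, test_stat)

# Idea behind check = "scatter" with nreps = NULL: average the replicated
# values for each observation, then compare them with y.
yrep_avg <- colMeans(yrep)

summary(T_yrep)  # spread of the statistic across replications
T_y              # the same statistic for the observed data
```

If the observed statistic falls far outside the spread of the replicated statistics, the model is not reproducing that feature of the data.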

References

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013). Bayesian Data Analysis. Chapman & Hall/CRC Press, London, third edition. (Ch. 6)

See Also

posterior_predict for drawing from the posterior predictive distribution. Examples of posterior predictive checks can also be found in the rstanarm vignettes and demos.

Examples

# Compare distribution of y to distributions of yrep
(pp_dist <- pp_check(example_model, check = "dist", overlay = TRUE))
pp_dist + 
 ggplot2::scale_color_manual(values = c("red", "black")) + # change colors
 ggplot2::scale_size_manual(values = c(0.5, 3)) + # change line sizes 
 ggplot2::scale_fill_manual(values = c(NA, NA)) # remove fill

# Check residuals
pp_check(example_model, check = "resid", nreps = 6)

# Check histograms of test statistics
test_mean <- pp_check(example_model, check = "test", test = "mean")
test_sd <- pp_check(example_model, check = "test", test = "sd")
gridExtra::grid.arrange(test_mean, test_sd, ncol = 2)

# Scatterplot of two test statistics
pp_check(example_model, check = "test", test = c("mean", "sd"))

# Scatterplots of y vs. yrep
fit <- stan_glm(mpg ~ wt, data = mtcars)
pp_check(fit, check = "scatter") # y vs. average yrep
pp_check(fit, check = "scatter", nreps = 3) # y vs. a few different yrep datasets 


# Defining a function to compute test statistic 
roaches$roach100 <- roaches$roach1 / 100
fit_pois <- stan_glm(y ~ treatment + roach100 + senior, offset = log(exposure2), 
                     family = "poisson", data = roaches)
fit_nb <- update(fit_pois, family = "neg_binomial_2")

prop0 <- function(y) mean(y == 0) # function to compute proportion of zeros
pp_check(fit_pois, check = "test", test = "prop0") # looks bad 
pp_check(fit_nb, check = "test", test = "prop0")   # much better
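Why the proportion of zeros separates a Poisson model from a negative binomial one can be seen with simulated counts alone. The sketch below does not use the roaches data or any fitted model: y_obs, yrep_pois and yrep_nb are hypothetical datasets drawn with arbitrarily chosen parameters (mean 5, size 0.5), purely to show how the statistic behaves under each distribution.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

prop0 <- function(y) mean(y == 0)  # proportion of zeros in a vector

# Hypothetical overdispersed counts standing in for an observed outcome
y_obs <- rnbinom(500, size = 0.5, mu = 5)

# 200 replicated datasets from each candidate distribution, one per row
yrep_pois <- matrix(rpois(200 * 500, lambda = 5), nrow = 200)
yrep_nb <- matrix(rnbinom(200 * 500, size = 0.5, mu = 5), nrow = 200)

# Apply the statistic to the observed data and to every replication
prop0_obs <- prop0(y_obs)
prop0_pois <- apply(yrep_pois, 1, prop0)
prop0_nb <- apply(yrep_nb, 1, prop0)

prop0_obs          # observed proportion of zeros
range(prop0_pois)  # Poisson replications contain far fewer zeros
range(prop0_nb)    # negative binomial replications sit near the observed value
```

With the same mean, the Poisson replications almost never produce zeros, while the overdispersed draws do, which is the pattern the two pp_check calls above are meant to expose.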
