pp_check: Graphical posterior predictive checks

Description

Various plots comparing the observed outcome variable $y$ to simulated datasets $y^{rep}$ from the posterior predictive distribution.
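
For reference, the posterior predictive distribution is conventionally defined (see Gelman et al., 2013, Ch. 6) by averaging the sampling distribution over the posterior distribution of the parameters. Here $\theta$ stands for the model parameters, a symbol introduced only for this formula and not used elsewhere on this page:

$$p(y^{rep} \mid y) = \int p(y^{rep} \mid \theta)\, p(\theta \mid y)\, d\theta$$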

Usage

pp_check(object, check = "distributions", nreps = NULL, seed = NULL,
  overlay = TRUE, test = "mean", ...)

Arguments

object

A fitted model object returned by one of the rstanarm modeling functions. See stanreg-objects.

check

The type of plot (possibly abbreviated) to show. One of "distributions", "residuals", "scatter", "test". See Details for descriptions.

nreps

The number of $y^{rep}$ datasets to generate from the posterior predictive distribution (posterior_predict) and show in the plots. The default is nreps=3 for check="residuals".

seed

An optional seed to pass to posterior_predict.

overlay

For check="distributions" only, should distributions be
plotted as density estimates overlaid in a single plot (TRUE, the 
default) or as separate histograms (FALSE)?

test

For check="test" only, a character vector (of length 1 or 
2) naming a single function or a pair of functions. The function(s) should 
take a vector input and return a scalar test statistic. See Details and
Examples.
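
As a rough illustration of the kind of function the test argument expects, the sketch below defines two candidate statistics in base R. The names prop_zero, q90 and y_demo are hypothetical and chosen for this example only; any function that maps a numeric vector to a single number should fit the description above.

```r
# Hypothetical test statistics: each takes a vector and returns one number.
prop_zero <- function(y) mean(y == 0)                       # proportion of zeros
q90 <- function(y) unname(stats::quantile(y, probs = 0.9))  # 90th percentile

# Made-up data, used only to show the input and output shapes
y_demo <- c(0, 0, 1, 3, 0, 7, 2, 0)
prop_zero(y_demo)
q90(y_demo)
```

Because test is a character vector, such a function would be passed by name, for instance test = "prop_zero" for a single statistic or test = c("prop_zero", "q90") for a pair.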

...

Optional arguments to geoms to control features of the plots 
(e.g. binwidth if the plot is a histogram).

Value

A ggplot object that can be further customized using the
  ggplot2 package.

Details

Descriptions of the plots corresponding to the different values of 
check:
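
"distributions": the distribution of $y$ compared with the distributions of the $y^{rep}$ datasets, drawn as overlaid density estimates or as separate histograms depending on overlay.

"residuals": plots of residuals based on nreps simulated $y^{rep}$ datasets.

"scatter": $y$ plotted against the average $y^{rep}$ or, when nreps is specified, against each of a few separate $y^{rep}$ datasets.

"test": a histogram of a single test statistic or, when test names two functions, a scatterplot of the pair.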
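
To make the idea behind check="test" concrete, here is a minimal base R sketch of comparing a test statistic on observed data with the same statistic on simulated datasets. It is not rstanarm's implementation: the data are made up, the yrep matrix is a stand-in for the output of posterior_predict (which returns one row per draw and one column per observation), and the plot is just one common way to display the comparison.

```r
set.seed(1)

# Made-up observed data and a stand-in for posterior predictive draws.
# In practice yrep would come from posterior_predict() on a fitted model.
y <- rpois(50, lambda = 2)
yrep <- matrix(rpois(200 * 50, lambda = 2), nrow = 200, ncol = 50)

# Test statistic on the observed data and on each simulated dataset (row)
T_y <- mean(y)
T_yrep <- apply(yrep, 1, mean)

# Histogram of T(yrep) with T(y) marked by a vertical line
hist(T_yrep, main = "T = mean", xlab = "T(yrep)")
abline(v = T_y, lwd = 2)

# Share of simulated statistics at least as large as the observed one
mean(T_yrep >= T_y)
```

If the observed statistic falls far in the tail of the simulated values, the model is reproducing that feature of the data poorly, which is the pattern the roaches example below is meant to show for the proportion of zeros.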

References

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari,
  A., and Rubin, D. B. (2013). Bayesian Data Analysis. Chapman & Hall/CRC
  Press, London, third edition. (Ch. 6)

See Also

posterior_predict for drawing from the posterior 
  predictive distribution. Examples of posterior predictive checks can also
  be found in the rstanarm vignettes and demos.

Examples

library(rstanarm) # provides pp_check, stan_glm and the roaches data

# Compare distribution of y to distributions of yrep
(pp_dist <- pp_check(example_model, check = "dist", overlay = TRUE))
pp_dist + 
 ggplot2::scale_color_manual(values = c("red", "black")) + # change colors
 ggplot2::scale_size_manual(values = c(0.5, 3)) + # change line sizes 
 ggplot2::scale_fill_manual(values = c(NA, NA)) # remove fill

# Check residuals
pp_check(example_model, check = "resid", nreps = 6)

# Check histograms of test statistics
test_mean <- pp_check(example_model, check = "test", test = "mean")
test_sd <- pp_check(example_model, check = "test", test = "sd")
gridExtra::grid.arrange(test_mean, test_sd, ncol = 2)

# Scatterplot of two test statistics
pp_check(example_model, check = "test", test = c("mean", "sd"))

# Scatterplots of y vs. yrep
fit <- stan_glm(mpg ~ wt, data = mtcars)
pp_check(fit, check = "scatter") # y vs. average yrep
pp_check(fit, check = "scatter", nreps = 3) # y vs. a few different yrep datasets 


# Defining a function to compute test statistic 
roaches$roach100 <- roaches$roach1 / 100
fit_pois <- stan_glm(y ~ treatment + roach100 + senior, offset = log(exposure2), 
                     family = "poisson", data = roaches)
fit_nb <- update(fit_pois, family = "neg_binomial_2")

prop0 <- function(y) mean(y == 0) # function to compute proportion of zeros
pp_check(fit_pois, check = "test", test = "prop0") # looks bad 
pp_check(fit_nb, check = "test", test = "prop0")   # much better