loo.stanreg: Leave-one-out (LOO) and K-fold cross-validation

Description

For models fit using MCMC, compute approximate leave-one-out cross-validation (LOO) or, less preferably, the Widely Applicable Information Criterion (WAIC) using the loo package. Exact $K$-fold cross-validation is also available. Compare two or more models using the compare function.

Usage

## S3 method for class 'stanreg'
loo(x, ..., k_threshold = NULL)
kfold(x, K = 10)
## S3 method for class 'stanreg'
waic(x, ...)

Arguments

x

A fitted model object returned by one of the rstanarm modeling functions. See stanreg-objects.

...

Optional arguments to pass to psislw. Possible arguments and their defaults are:

We recommend using the default values for the psislw arguments unless there are problems (e.g. NA or NaN results).

k_threshold

Threshold for flagging estimates of the Pareto shape parameters $k$ estimated by loo. See the How to proceed when loo gives warnings section, below, for details.

K

The number of subsets of equal (if possible) size into which the data will be randomly partitioned for performing $K$-fold cross-validation. The model is refit $K$ times, each time leaving out one of the $K$ subsets. If $K$ is equal to the total number of observations in the data then $K$-fold cross-validation is equivalent to exact leave-one-out cross-validation.

Value

An object of class 'loo'. See the 'Value' section in loo and waic for details on the structure of these objects. The object returned by kfold also has class 'kfold' in addition to 'loo'.

Approximate LOO CV

The loo method for stanreg objects provides an interface to the loo package for approximate leave-one-out cross-validation (LOO). The LOO Information Criterion (LOOIC) has the same purpose as the Akaike Information Criterion (AIC) that is used by frequentists. Both are intended to estimate the expected log predictive density (ELPD) for a new dataset. However, the AIC ignores priors and assumes that the posterior distribution is multivariate normal, whereas the functions from the loo package do not make this distributional assumption and integrate over uncertainty in the parameters.

The LOO approximation assumes only that any one observation can be omitted without having a major effect on the posterior distribution. This assumption can be judged using the diagnostic plot provided by the plot.loo method and the warnings provided by the print.loo method (see the How to Use the rstanarm Package vignette for an example of this process).

How to proceed when loo gives warnings (k_threshold)

The k_threshold argument to the loo method for rstanarm models is provided as a possible remedy when the diagnostics reveal problems stemming from the posterior's sensitivity to particular observations. Warnings about Pareto $k$ estimates indicate observations for which the approximation to LOO is problematic (this is described in detail in Vehtari, Gelman, and Gabry (2016) and the loo package documentation). The k_threshold argument can be used to set the $k$ value above which an observation is flagged.

If k_threshold is not NULL and there are $J$ observations with $k$ estimates above k_threshold, then when loo is called it will refit the original model $J$ times, each time leaving out one of the $J$ problematic observations. The pointwise contributions of these observations to the total ELPD are then computed directly and substituted for the previous estimates from these $J$ observations that are stored in the object created by loo.

Note: in the warning messages issued by loo about large Pareto $k$ estimates we recommend setting k_threshold to at least $0.7$. There is a theoretical reason, explained in Vehtari, Gelman, and Gabry (2016), for setting the threshold to the stricter value of $0.5$, but in practice they find that errors in the LOO approximation start to increase non-negligibly when $k > 0.7$.
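
For reference, the quantities discussed in this section can be written out as follows. The notation is introduced here for illustration and is not taken from the package documentation: $y_{-i}$ denotes the data with observation $i$ removed, $\hat{p}_{\mathrm{PSIS}}$ the approximate leave-one-out predictive density, and $\mathcal{J}$ the set of $J$ observations whose $k$ estimate exceeds k_threshold.

```latex
$$
\mathrm{elpd}_{\mathrm{loo}} = \sum_{i=1}^{n} \log p(y_i \mid y_{-i}),
\qquad
\mathrm{LOOIC} = -2\,\mathrm{elpd}_{\mathrm{loo}}
$$

$$
\widehat{\mathrm{elpd}}_{\mathrm{loo}}
  = \sum_{i \notin \mathcal{J}} \log \hat{p}_{\mathrm{PSIS}}(y_i \mid y_{-i})
  + \sum_{i \in \mathcal{J}} \log p(y_i \mid y_{-i})
$$
```

In the second expression, the terms for $i \in \mathcal{J}$ are the ones computed directly from the $J$ refits; all other terms keep their approximate values.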

K-fold CV

The kfold function performs exact $K$-fold cross-validation. First the data are randomly partitioned into $K$ subsets of equal (or as close to equal as possible) size. Then the model is refit $K$ times, each time leaving out one of the K subsets. If $K$ is equal to the total number of observations in the data then $K$-fold cross-validation is equivalent to exact leave-one-out cross-validation (to which loo is an efficient approximation). The compare function is also compatible with the objects returned by kfold.
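
The random partitioning step described above can be sketched in base R. This is only an illustration of the idea under simple assumptions, not the code kfold uses internally; the names fold_id, held_out, train, and test are hypothetical.

```r
# Illustrative sketch only: NOT rstanarm's internal implementation.
set.seed(123)            # make the random partition reproducible
N <- nrow(mtcars)        # number of observations in the built-in mtcars data
K <- 10                  # number of folds

# Assign each observation to one of K folds of (nearly) equal size:
# repeat the labels 1..K until there are N of them, then shuffle.
fold_id <- sample(rep_len(seq_len(K), N))

# Fold sizes differ by at most one observation
fold_sizes <- table(fold_id)
print(fold_sizes)

# For fold 1: the rows left out, and the rows the model would be refit on
held_out <- which(fold_id == 1)
train    <- mtcars[-held_out, ]
test     <- mtcars[held_out, ]
```

Repeating the last step for each fold in turn gives the $K$ refits; with K equal to N every fold contains a single observation, which corresponds to exact leave-one-out cross-validation.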

References

Vehtari, A., Gelman, A., and Gabry, J. (2016). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing. Advance online publication. doi:10.1007/s11222-016-9696-4. arXiv preprint: http://arxiv.org/abs/1507.04544/

Examples

library(rstanarm)

# fit two candidate models to the built-in mtcars data
fit1 <- stan_glm(mpg ~ wt, data = mtcars)
fit2 <- stan_glm(mpg ~ wt + cyl, data = mtcars)

# compare on LOOIC
(loo1 <- loo(fit1, cores = 2))
loo2 <- loo(fit2, cores = 2)
compare(loo1, loo2)
plot(loo2)  # diagnostic plot of the Pareto k estimates

# 10-fold cross-validation (refits each model 10 times)
(kfold1 <- kfold(fit1, K = 10))
kfold2 <- kfold(fit2, K = 10)
compare(kfold1, kfold2)
