compare
function.
loo(x, ..., k_threshold = NULL)
kfold(x, K = 10)
waic(x, ...)
stanreg-objects.
psislw. Possible arguments and their defaults are:
We recommend using the default values for the psislw arguments unless there
are problems (e.g. NA or NaN results).
loo. See the "How to proceed when loo gives warnings" section, below, for
details.
K times, each time leaving out one of the K subsets. If K is equal to the
total number of observations in the data then $K$-fold cross-validation is
equivalent to exact leave-one-out cross-validation.

The loo
method for stanreg objects provides an interface to
the loo package for approximate leave-one-out
cross-validation (LOO). The LOO Information Criterion (LOOIC) has the same
purpose as the Akaike Information Criterion (AIC) that is used by
frequentists. Both are intended to estimate the expected log predictive
density (ELPD) for a new dataset. However, the AIC ignores priors and assumes
that the posterior distribution is multivariate normal, whereas the functions
from the loo package do not make this distributional assumption and
integrate over uncertainty in the parameters. The approximation assumes only that any one
observation can be omitted without having a major effect on the posterior
distribution, which can be judged using the diagnostic plot provided by the
plot.loo method and the warnings provided by the print.loo method (see the
How to Use the rstanarm Package vignette for an example of this process).

How to proceed when loo gives warnings (k_threshold)
The k_threshold argument to the loo method for rstanarm models is provided as
a possible remedy when the diagnostics reveal problems stemming from the
posterior's sensitivity to particular observations.
Warnings about Pareto $k$ estimates indicate observations for which the
approximation to LOO is problematic (this is described in detail in Vehtari,
Gelman, and Gabry (2016) and the loo package
documentation). The k_threshold argument can be used to set the $k$ value
above which an observation is flagged. If k_threshold is not NULL and there
are $J$ observations with $k$ estimates above k_threshold, then when loo is
called it will refit the original model $J$ times, each time leaving out one
of the $J$ problematic observations. The pointwise contributions of these
observations to the total ELPD are then computed directly and substituted for
the previous estimates from these $J$ observations that are stored in the
object created by loo.

Note: in the warning messages issued by loo about large Pareto $k$ estimates
we recommend setting k_threshold to at least $0.7$. There is a theoretical
reason, explained in Vehtari, Gelman, and Gabry (2016), for setting the
threshold to the stricter value of $0.5$, but in practice they find that
errors in the LOO approximation start to increase non-negligibly when
$k > 0.7$.

The kfold
function performs exact $K$-fold cross-validation. First
the data are randomly partitioned into $K$ subsets of equal (or as close
to equal as possible) size. Then the model is refit $K$ times, each time
leaving out one of the K subsets. If $K$ is equal to the total number of
observations in the data then $K$-fold cross-validation is equivalent to
exact leave-one-out cross-validation (to which loo is an efficient
approximation). The compare function is also compatible with the objects
returned by kfold.

compare for comparing two or more models on LOO, WAIC, or $K$-fold CV.
The various rstanarm vignettes for more examples of using loo.
loo-package (in particular the PSIS-LOO section) for details on the
computations implemented by the loo package and the interpretation of the
Pareto $k$ estimates displayed when using the plot.loo method.
log_lik.stanreg to directly access the pointwise log-likelihood matrix.
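To illustrate what a pointwise log-likelihood matrix is used for, the following base R sketch computes WAIC by hand from such a matrix. Everything in it is an assumption made for illustration: the matrix is simulated for a normal model with known standard deviation rather than extracted with log_lik, and the names y, mu_draws, ll, lppd_i, and p_waic_i are hypothetical. It follows the standard WAIC formulas and should not be read as the loo package's implementation.

```r
# Sketch only: a simulated stand-in for an S x N pointwise log-likelihood
# matrix (S posterior draws, N observations); not output from a real fit.
set.seed(123)
N <- 20
S <- 1000
y <- rnorm(N, mean = 1, sd = 1)

# Posterior draws for the mean of a normal model with known sd = 1 and a
# flat prior: mu | y ~ N(mean(y), 1 / N)
mu_draws <- rnorm(S, mean = mean(y), sd = 1 / sqrt(N))

# ll[s, i] = log p(y_i | mu_s); sapply() returns one column per observation
ll <- sapply(y, function(yi) dnorm(yi, mean = mu_draws, sd = 1, log = TRUE))

# Log pointwise predictive density: log of the posterior mean likelihood
lppd_i <- log(colMeans(exp(ll)))

# Effective number of parameters: posterior variance of the log-likelihood
p_waic_i <- apply(ll, 2, var)

elpd_waic <- sum(lppd_i - p_waic_i)
waic <- -2 * elpd_waic
c(elpd_waic = elpd_waic, p_waic = sum(p_waic_i), waic = waic)
```

With a fitted model, the matrix returned by log_lik would take the place of the simulated ll. In practice the waic method is the way to obtain this estimate, since it also reports standard errors.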
library(rstanarm)
fit1 <- stan_glm(mpg ~ wt, data = mtcars)
fit2 <- stan_glm(mpg ~ wt + cyl, data = mtcars)
# compare on LOOIC
(loo1 <- loo(fit1, cores = 2))
loo2 <- loo(fit2, cores = 2)
compare(loo1, loo2)
plot(loo2)
# 10-fold cross-validation
(kfold1 <- kfold(fit1, K = 10))
kfold2 <- kfold(fit2, K = 10)
compare(kfold1, kfold2)
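The partition-and-refit procedure described for kfold can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not what kfold does internally: lm with a plug-in normal predictive density stands in for refitting the Bayesian model, and the names fold, fit_k, elpd_i, and elpd_kfold are hypothetical.

```r
# Sketch only: lm() with a plug-in normal density stands in for refitting
# the Bayesian model; kfold() itself refits the stanreg model.
set.seed(42)
K <- 10
N <- nrow(mtcars)

# Randomly assign each observation to one of K folds of near-equal size
fold <- sample(rep(seq_len(K), length.out = N))

elpd_i <- numeric(N)
for (k in seq_len(K)) {
  held_out <- fold == k
  # Refit the model without the k-th subset
  fit_k <- lm(mpg ~ wt, data = mtcars[!held_out, ])
  mu <- predict(fit_k, newdata = mtcars[held_out, ])
  # Log predictive density of the held-out observations
  elpd_i[held_out] <- dnorm(mtcars$mpg[held_out], mean = mu,
                            sd = summary(fit_k)$sigma, log = TRUE)
}

table(fold)                 # fold sizes differ by at most one
elpd_kfold <- sum(elpd_i)   # summed held-out log predictive density
elpd_kfold
```

Setting K equal to N in this sketch puts every observation in its own fold, which corresponds to the exact leave-one-out case mentioned above.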