# loo_model_weights

##### Model averaging/weighting via stacking or pseudo-BMA weighting

Model averaging via stacking of predictive distributions, pseudo-BMA weighting or pseudo-BMA+ weighting with the Bayesian bootstrap. See Yao et al. (2018) and Vehtari, Gelman, and Gabry (2017a,2017b) for background.

##### Usage

`loo_model_weights(x, ...)`# S3 method for default
loo_model_weights(x, ..., method = c("stacking",
"pseudobma"), optim_method = "BFGS", optim_control = list(), BB = TRUE,
BB_n = 1000, alpha = 1, r_eff_list = NULL,
cores = getOption("mc.cores", 1))

stacking_weights(lpd_point, optim_method = "BFGS", optim_control = list())

pseudobma_weights(lpd_point, BB = TRUE, BB_n = 1000, alpha = 1)

##### Arguments

- x
A list of pointwise log-likelihood matrices or "psis_loo" objects (objects returned by

`loo`

), one for each model. Each matrix/object should have dimensions \(S\) by \(N\), where \(S\) is the size of the posterior sample (with all chains merged) and \(N\) is the number of data points. If`x`

is a list of log-likelihood matrices then`loo`

is called internally on each matrix. Currently the`loo_model_weights`

function is not implemented to be used with results from K-fold CV, but you can still obtain weights using K-fold CV results by calling the`stacking_weights`

function directly.- ...
Unused, except for the generic to pass arguments to individual methods.

- method
Either

`"stacking"`

or`"pseudobma"`

, indicating which method to use for obtaining the weights.`"stacking"`

refers to stacking of predictive distributions and`"pseudobma"`

refers to pseudo-BMA+ weighting (or plain pseudo-BMA weighting if`BB`

is`FALSE`

).- optim_method
The optimization method to use if

`method="stacking"`

. It can be chosen from "Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN" and "Brent". The default method is "BFGS".- optim_control
If

`method="stacking"`

, a list of control parameters for optimization. See`constrOptim`

for details.- BB
Logical used when

`"method"`

=`"pseudobma"`

. If`TRUE`

(the default), the Bayesian bootstrap will be used to adjust the pseudo-BMA weighting, which is called pseudo-BMA+ weighting. It helps regularize the weight away from 0 and 1, so as to reduce the variance.- BB_n
For pseudo-BMA+ weighting only, the number of samples to use for the Bayesian bootstrap. The default is 1000.

- alpha
Positive scalar shape parameter in the Dirichlet distribution used for the Bayesian bootstrap. The default is 1, which corresponds to a uniform distribution on the simplex space.

- r_eff_list
Optionally, a list of relative effective sample size estimates for the likelihood

`(exp(log_lik))`

of each observation in each model. See`psis`

and`relative_eff`

helper function for computing`r_eff`

. If`x`

is a list of "psis_loo" objects then`r_eff_list`

is ignored.- cores
The number of cores to use for parallelization. This defaults to the option

`mc.cores`

which can be set for an entire R session by`options(mc.cores = NUMBER)`

. The old option`loo.cores`

is now deprecated but will be given precedence over`mc.cores`

until`loo.cores`

is removed in a future release.**As of version 2.0.0 the default is now 1 core if**`mc.cores`

is not set, but we recommend using as many (or close to as many) cores as possible.- lpd_point
A matrix of pointwise leave-one-out (or K-fold) log likelihoods evaluated for different models. It should be a \(N\) by \(K\) matrix where \(N\) is sample size and \(K\) is the number of models. Each column corresponds to one model. These values can be calculated approximately using

`loo`

or by running exact leave-one-out or K-fold cross-validation.

##### Details

`loo_model_weights`

is a wrapper around the `stacking_weights`

and
`pseudobma_weights`

functions that implements stacking, pseudo-BMA, and
pseudo-BMA+ weighting for combining multiple predictive distributions. We can
use approximate or exact leave-one-out cross-validation (LOO-CV) or K-fold CV
to estimate the expected log predictive density (ELPD).

The stacking method (`method="stacking"`

) combines all models by
maximizing the leave-one-out predictive density of the combination
distribution. That is, it finds the optimal linear combining weights for
maximizing the leave-one-out log score.

The pseudo-BMA method (`method="pseudobma"`

) finds the relative weights
proportional to the ELPD of each model. However, when
`method="pseudobma"`

, the default is to also use the Bayesian bootstrap
(`BB=TRUE`

), which corresponds to the pseudo-BMA+ method. The Bayesian
bootstrap takes into account the uncertainty of finite data points and
regularizes the weights away from the extremes of 0 and 1.

In general, we recommend stacking for averaging predictive distributions, while pseudo-BMA+ can serve as a computationally easier alternative.

##### Value

A numeric vector containing one weight for each model.

##### References

Vehtari, A., Gelman, A., and Gabry, J. (2017a). Practical
Bayesian model evaluation using leave-one-out cross-validation and WAIC.
*Statistics and Computing*. 27(5), 1413--1432.
doi:10.1007/s11222-016-9696-4.
(published
version, arXiv preprint).

Vehtari, A., Gelman, A., and Gabry, J. (2017b). Pareto smoothed importance sampling. arXiv preprint: http://arxiv.org/abs/1507.02646/

Yao, Y., Vehtari, A., Simpson, D., and Gelman, A. (2018) Using
stacking to average Bayesian predictive distributions. *Bayesian
Analysis*, advance publication, doi:10.1214/17-BA1091.
(online).

##### See Also

The loo package vignettes for demonstrations.

`loo`

for details on leave-one-out ELPD estimation.`constrOptim`

for the choice of optimization methods and control-parameters.`relative_eff`

for computing`r_eff`

.

##### Examples

```
# NOT RUN {
### Demonstrating usage after fitting models with RStan
library(rstan)
# generate fake data from N(0,1).
N <- 100
y <- rnorm(N, 0, 1)
# Suppose we have three models: N(-1, sigma), N(0.5, sigma) and N(0.6,sigma).
stan_code <- "
data {
int N;
vector[N] y;
real mu_fixed;
}
parameters {
real<lower=0> sigma;
}
model {
sigma ~ exponential(1);
y ~ normal(mu_fixed, sigma);
}
generated quantities {
vector[N] log_lik;
for (n in 1:N) log_lik[n] = normal_lpdf(y[n]| mu_fixed, sigma);
}"
mod <- stan_model(model_code = stan_code)
fit1 <- sampling(mod, data=list(N=N, y=y, mu_fixed=-1))
fit2 <- sampling(mod, data=list(N=N, y=y, mu_fixed=0.5))
fit3 <- sampling(mod, data=list(N=N, y=y, mu_fixed=0.6))
model_list <- list(fit1, fit2, fit3)
log_lik_list <- lapply(model_list, extract_log_lik)
# optional but recommended
r_eff_list <- lapply(model_list, function(x) {
ll_array <- extract_log_lik(x, merge_chains = FALSE)
relative_eff(exp(ll_array))
})
# stacking method:
wts1 <- loo_model_weights(
log_lik_list,
method = "stacking",
r_eff_list = r_eff_list,
optim_control = list(reltol=1e-10)
)
print(wts1)
# can also pass a list of psis_loo objects to avoid recomputing loo
loo_list <- lapply(1:length(log_lik_list), function(j) {
loo(log_lik_list[[j]], r_eff = r_eff_list[[j]])
})
wts2 <- loo_model_weights(
loo_list,
method = "stacking",
optim_control = list(reltol=1e-10)
)
all.equal(wts1, wts2)
# pseudo-BMA+ method:
set.seed(1414)
loo_model_weights(loo_list, method = "pseudobma")
# pseudo-BMA method (set BB = FALSE):
loo_model_weights(loo_list, method = "pseudobma", BB = FALSE)
# calling stacking_weights or pseudobma_weights directly
lpd1 <- loo(log_lik_list[[1]], r_eff = r_eff_list[[1]])$pointwise[,1]
lpd2 <- loo(log_lik_list[[2]], r_eff = r_eff_list[[2]])$pointwise[,1]
lpd3 <- loo(log_lik_list[[3]], r_eff = r_eff_list[[3]])$pointwise[,1]
stacking_weights(cbind(lpd1, lpd2, lpd3))
pseudobma_weights(cbind(lpd1, lpd2, lpd3))
pseudobma_weights(cbind(lpd1, lpd2, lpd3), BB = FALSE)
# }
# NOT RUN {
# }
```

*Documentation reproduced from package loo, version 2.0.0, License: GPL (>= 3)*