weights.importance_sampling: Pareto smoothed importance sampling (PSIS)

Description

Implementation of Pareto smoothed importance sampling (PSIS), a method for stabilizing importance ratios. The version of PSIS implemented here corresponds to the algorithm presented in Vehtari, Gelman and Gabry (2017b). For PSIS diagnostics see the pareto-k-diagnostic page.

Usage

# S3 method for importance_sampling
weights(object, ..., log = TRUE, normalize = TRUE)
psis(log_ratios, ...)
# S3 method for array
psis(log_ratios, ..., r_eff = NULL, cores = getOption("mc.cores", 1))
# S3 method for matrix
psis(log_ratios, ..., r_eff = NULL, cores = getOption("mc.cores", 1))
# S3 method for default
psis(log_ratios, ..., r_eff = NULL)
is.psis(x)
is.sis(x)
is.tis(x)

Arguments

object

For the weights() method, an object returned by psis()/tis()/sis() (a list with same class name "psis"/"tis"/"sis".).

...

Arguments passed on to the various methods.

log

For the weights() method, should the weights be returned on the log scale? Defaults to TRUE.

normalize

For the weights() method, should the weights be normalized? Defaults to TRUE.

log_ratios

An array, matrix, or vector of importance ratios on the log scale (for PSIS-LOO these are negative log-likelihood values). See the Methods (by class) section below for a detailed description of how to specify the inputs for each method.

r_eff

Vector of relative effective sample size estimates containing one element per observation. The values provided should be the relative effective sample sizes of 1/exp(log_ratios) (i.e., 1/ratios). This is related to the relative efficiency of estimating the normalizing term in self-normalizing importance sampling. If r_eff is not provided then the reported PSIS effective sample sizes and Monte Carlo error estimates will be over-optimistic. See the relative_eff() helper function for computing r_eff. If using psis with draws of the log_ratios not obtained from MCMC then the warning message thrown when not specifying r_eff can be disabled by setting r_eff to NA.

cores

The number of cores to use for parallelization. This defaults to the option mc.cores which can be set for an entire R session by options(mc.cores = NUMBER). The old option loo.cores is now deprecated but will be given precedence over mc.cores until loo.cores is removed in a future release. As of version 2.0.0 the default is now 1 core if mc.cores is not set, but we recommend using as many (or close to as many) cores as possible.

Note for Windows 10 users: it is strongly recommended to avoid using the .Rprofile file to set mc.cores (using the cores argument or setting mc.cores interactively or in a script is fine).

For is.psis(), an object to check.

Value

The weights() method returns an object with the same dimensions as the log_weights component of the "psis"/"tis"/"sis" object. The normalize and log arguments control whether the returned weights are normalized and whether or not to return them on the log scale.

The psis() methods return an object of class "psis", which is a named list with the following components:

log_weights

Vector or matrix of smoothed (and truncated) but unnormalized log weights, minus the largest log ratio for numerical reasons. To get normalized weights use the weights method provided for objects of class "psis".

diagnostics

A named list containing two vectors:

pareto_k: Estimates of the shape parameter \(k\) of the generalized Pareto distribution. See the pareto-k-diagnostic page for details.
n_eff: PSIS effective sample size estimates.

Objects of class "psis" also have the following attributes:

norm_const_log: Vector of precomputed values of colLogSumExps(log_weights) that are used internally by the weights method to normalize the log weights.
tail_len: Vector of tail lengths used for fitting the generalized Pareto distribution.
r_eff: If specified, the user's r_eff argument.
dims: Integer vector of length 2 containing S (posterior sample size) and N (number of observations).
method: Method used for importance sampling, here psis.

Methods (by class)

array: An \(I\) by \(C\) by \(N\) array, where \(I\) is the number of MCMC iterations per chain, \(C\) is the number of chains, and \(N\) is the number of data points.
matrix: An \(S\) by \(N\) matrix, where \(S\) is the size of the posterior sample (with all chains merged) and \(N\) is the number of data points.
default: A vector of length \(S\) (posterior sample size).

References

Vehtari, A., Gelman, A., and Gabry, J. (2017a). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing. 27(5), 1413--1432. doi:10.1007/s11222-016-9696-4 (journal version, preprint arXiv:1507.04544).

Vehtari, A., Gelman, A., and Gabry, J. (2017b). Pareto smoothed importance sampling. preprint arXiv:1507.02646

Examples

Run this code

# NOT RUN {
log_ratios <- -1 * example_loglik_array()
r_eff <- relative_eff(exp(-log_ratios))
psis_result <- psis(log_ratios, r_eff = r_eff)
str(psis_result)
plot(psis_result)

# extract smoothed weights
lw <- weights(psis_result) # default args are log=TRUE, normalize=TRUE
ulw <- weights(psis_result, normalize=FALSE) # unnormalized log-weights

w <- weights(psis_result, log=FALSE) # normalized weights (not log-weights)
uw <- weights(psis_result, log=FALSE, normalize = FALSE) # unnormalized weights



# }

Run the code above in your browser using DataLab