posterior (version 1.7.0)

pareto_pit: Pareto-smoothed probability integral transform

Description

Compute PIT values using the empirical CDF (ECDF), then refine the values in the tails by fitting a generalized Pareto distribution (GPD) to the tail draws. This gives smoother, more accurate PIT values in the tails, where the ECDF is coarse, and avoids PIT values of exactly 0 and 1. Because the tails use the GPD CDF, the PIT values are no longer rank-based, so a continuous uniformity test is appropriate.
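To illustrate why the ECDF is coarse in the tails, here is a minimal base R sketch. The helper `ecdf_pit()` is hypothetical and is not part of posterior; it assumes a plain "proportion of draws less than or equal to y" convention, which may differ from how pit() handles ties or weights.

```r
# Hypothetical helper (not part of posterior): plain ECDF value at y
ecdf_pit <- function(draws, y) mean(draws <= y)

set.seed(1)
draws <- rnorm(100)  # 100 simulated draws for a single variable

# An observation beyond all draws gets a PIT value of exactly 1 or 0
pit_above <- ecdf_pit(draws, max(draws) + 1)
pit_below <- ecdf_pit(draws, min(draws) - 1)

# With S draws, the ECDF can only take values on the grid 0, 1/S, ..., 1
pit_mid <- ecdf_pit(draws, 0.3)
c(pit_above, pit_below, pit_mid)
```

With only a handful of draws beyond a tail observation, neighbouring grid values are far apart in relative terms, which is the coarseness the GPD refinement is meant to address.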

Usage

pareto_pit(x, y, ...)

# S3 method for default
pareto_pit(x, y, weights = NULL, log = FALSE, ndraws_tail = NULL, ...)

# S3 method for draws_matrix
pareto_pit(x, y, weights = NULL, log = FALSE, ndraws_tail = NULL, ...)

# S3 method for rvar
pareto_pit(x, y, weights = NULL, log = FALSE, ndraws_tail = NULL, ...)

Value

A numeric vector of length length(y) containing the PIT values, or an array of shape dim(y), if x is an rvar.

Arguments

x

(draws) A draws_matrix object or one coercible to a draws_matrix object, or an rvar object.

y

(observations) A 1D vector, or an array with the same dimensions as x (dim(x)) if x is an rvar. Each element of y corresponds to a variable in x.

...

Arguments passed to individual methods (if applicable).

weights

A matrix of weights for each draw and variable. weights should have one column per variable in x, and ndraws(x) rows.

log

(logical) Are the weights passed already on the log scale? The default is FALSE, that is, expecting weights to be on the standard (non-log) scale.

ndraws_tail

(integer) Number of tail draws to use for GPD fitting. If NULL (the default), computed using ps_tail_length().

Details

The function first computes raw PIT values identically to pit() (including support for weighted draws). It then fits a GPD to both tails of the draws (using the same approach as pareto_smooth()) and replaces PIT values for observations falling in the tail regions:

For a right-tail observation \(y_i > c_R\) (where \(c_R\) is the right-tail cutoff):

$$PIT(y_i) = 1 - p_{tail}(1 - F_{GPD}(y_i; c_R, \sigma_R, k_R))$$

For a left-tail observation \(y_i < c_L\):

$$PIT(y_i) = p_{tail}(1 - F_{GPD}(-y_i; -c_L, \sigma_L, k_L))$$

where \(p_{tail}\) is the proportion of (weighted) mass in the tail.
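The two tail formulas can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: `gpd_cdf()`, `pit_right()`, and `pit_left()` are hypothetical helpers, the GPD is assumed to use the standard location-scale-shape parameterization, and the parameter values are made up rather than fitted.

```r
# Hypothetical helper (not part of posterior): GPD CDF with location mu,
# scale sigma, and scalar shape k, evaluated at y >= mu.
gpd_cdf <- function(y, mu, sigma, k) {
  z <- pmax(y - mu, 0) / sigma
  if (abs(k) < 1e-12) {
    1 - exp(-z)                       # exponential limit as k -> 0
  } else {
    1 - pmax(1 + k * z, 0)^(-1 / k)   # general case
  }
}

# Right tail: PIT = 1 - p_tail * (1 - F_GPD(y; c_R, sigma_R, k_R))
pit_right <- function(y, c_R, sigma_R, k_R, p_tail) {
  1 - p_tail * (1 - gpd_cdf(y, c_R, sigma_R, k_R))
}

# Left tail: negate the observation and the cutoff so the left tail
# is handled as a right tail.
# PIT = p_tail * (1 - F_GPD(-y; -c_L, sigma_L, k_L))
pit_left <- function(y, c_L, sigma_L, k_L, p_tail) {
  p_tail * (1 - gpd_cdf(-y, -c_L, sigma_L, k_L))
}

# Made-up values: 10% of the mass in each tail, cutoffs at -2 and 2
p_tail <- 0.1
right <- pit_right(c(2, 3, 10), c_R = 2, sigma_R = 1, k_R = 0.2, p_tail = p_tail)
left  <- pit_left(c(-2, -3, -10), c_L = -2, sigma_L = 1, k_L = 0.2, p_tail = p_tail)
right
left
```

At the cutoff itself the GPD CDF is 0, so the right-tail formula gives 1 - p_tail and the left-tail formula gives p_tail. Further into the tail, the values move smoothly towards 1 and 0 respectively.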

When weights (or log-weights, if log = TRUE) are provided, they are used both for the raw PIT computation (as in pit()) and for the GPD fit.
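As a rough illustration of how weights could enter the raw PIT computation, the sketch below evaluates a weighted ECDF in base R. The helper `weighted_ecdf_pit()` is hypothetical and not part of posterior; the exact conventions used by pit() (for example, tie handling) are not specified here and may differ.

```r
# Hypothetical helper (not part of posterior): weighted ECDF value at y.
weighted_ecdf_pit <- function(draws, y, weights, log = FALSE) {
  if (log) {
    # Subtract the maximum before exponentiating for numerical stability
    weights <- exp(weights - max(weights))
  }
  weights <- weights / sum(weights)  # normalize so the weights sum to 1
  sum(weights[draws <= y])           # weighted mass at or below y
}

set.seed(1)
draws <- rnorm(200)
w <- runif(200)  # made-up weights on the standard (non-log) scale

# Passing the same weights on either scale should agree up to
# floating-point error
p_std <- weighted_ecdf_pit(draws, 0.5, w)
p_log <- weighted_ecdf_pit(draws, 0.5, log(w), log = TRUE)
all.equal(p_std, p_log)
```

With equal weights, this reduces to the ordinary ECDF value, that is, the proportion of draws at or below y.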

See Also

pit() for the unsmoothed version, pareto_smooth() for Pareto smoothing of draws.

Examples

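library(posterior)
# Example posterior draws shipped with the package
x <- example_draws()
# One simulated observation per variable in x
y <- rnorm(nvariables(x), 5, 5)
pareto_pit(x, y)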
