Compute probability integral transform (PIT) values using the empirical CDF (ECDF), then refine the values in the tails by fitting a generalized Pareto distribution (GPD) to the tail draws. This gives smoother, more accurate PIT values in the tails, where the ECDF is coarse, and avoids PIT values of exactly 0 and 1. Because the tail values come from the GPD CDF, the PIT values are no longer rank-based, so a continuous uniformity test is appropriate.
pareto_pit(x, y, ...)
# S3 method for default
pareto_pit(x, y, weights = NULL, log = FALSE, ndraws_tail = NULL, ...)
# S3 method for draws_matrix
pareto_pit(x, y, weights = NULL, log = FALSE, ndraws_tail = NULL, ...)
# S3 method for rvar
pareto_pit(x, y, weights = NULL, log = FALSE, ndraws_tail = NULL, ...)
A numeric vector of length length(y) containing the PIT values, or
an array of shape dim(y), if x is an rvar.
x: (draws) A draws_matrix object or one coercible to a
draws_matrix object, or an rvar object.
y: (observations) A 1D vector, or an array of dim(x), if x is an rvar.
Each element of y corresponds to a variable in x.
...: Arguments passed to individual methods (if applicable).
weights: A matrix of weights for each draw and variable. weights
should have one column per variable in x, and ndraws(x) rows.
log: (logical) Are the weights passed already on the log scale? The
default is FALSE, that is, expecting weights to be on the standard
(non-log) scale.
ndraws_tail: (integer) Number of tail draws to use for GPD
fitting. If NULL (the default), computed using ps_tail_length().
The function first computes raw PIT values identically to
pit() (including support for weighted draws). It then fits a
GPD to both tails of the draws (using the same approach as
pareto_smooth()) and replaces PIT values for observations falling in
the tail regions:
For a right-tail observation \(y_i > c_R\) (where \(c_R\) is the right-tail cutoff):
$$PIT(y_i) = 1 - p_{tail}(1 - F_{GPD}(y_i; c_R, \sigma_R, k_R))$$
For a left-tail observation \(y_i < c_L\):
$$PIT(y_i) = p_{tail}(1 - F_{GPD}(-y_i; -c_L, \sigma_L, k_L))$$
where \(p_{tail}\) is the proportion of (weighted) mass in the tail.
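The two tail formulas can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's code: `gpd_cdf()`, `pit_right()` and `pit_left()` are hypothetical helper names, the GPD is written in its standard location/scale/shape form, and the cutoff, scale, shape and `p_tail` values are made up rather than fitted to draws.

```r
# Generalized Pareto CDF with location mu, scale sigma and shape k.
# Written in base R so the sketch has no package dependencies.
gpd_cdf <- function(q, mu, sigma, k) {
  z <- pmax((q - mu) / sigma, 0)
  if (abs(k) < 1e-12) {
    1 - exp(-z)                       # k = 0 reduces to the exponential CDF
  } else {
    1 - pmax(1 + k * z, 0)^(-1 / k)
  }
}

# Right tail: PIT = 1 - p_tail * (1 - F_GPD(y; c_R, sigma_R, k_R))
pit_right <- function(y, cutoff, sigma, k, p_tail) {
  1 - p_tail * (1 - gpd_cdf(y, cutoff, sigma, k))
}

# Left tail: negate y and the cutoff so the left tail is handled
# like a right tail, PIT = p_tail * (1 - F_GPD(-y; -c_L, sigma_L, k_L))
pit_left <- function(y, cutoff, sigma, k, p_tail) {
  p_tail * (1 - gpd_cdf(-y, -cutoff, sigma, k))
}

# Hypothetical parameter values, chosen only for illustration
p_tail <- 0.1
pit_right(c(2, 3, 10), cutoff = 2, sigma = 1, k = 0.2, p_tail = p_tail)
pit_left(c(-2, -3, -10), cutoff = -2, sigma = 1, k = 0.2, p_tail = p_tail)
```

At the cutoff itself the right-tail value equals 1 - p_tail and the left-tail value equals p_tail, so the smoothed tail joins the ECDF-based body at the cutoffs. With these example parameters the values then approach 1 and 0 without reaching them.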
When weights (or log-weights) are provided in weights, they are used both for
the raw PIT computation (as in pit()) and for the GPD fit.
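To make the raw, ECDF-based step concrete, here is a minimal base R sketch of a weighted PIT for a single variable. It is a simplification and an assumption about the general idea, not the implementation of pit(): `raw_pit()` is a hypothetical helper, and it ignores details such as tie handling. Its `log` argument mirrors the meaning of `log` described above.

```r
# Weighted ECDF PIT for one variable: the weighted share of draws <= y.
# `log = TRUE` means `weights` already holds log-weights.
raw_pit <- function(draws, y, weights = NULL, log = FALSE) {
  if (is.null(weights)) {
    return(mean(draws <= y))
  }
  if (log) {
    # Subtract the maximum before exponentiating for numerical stability
    weights <- exp(weights - max(weights))
  }
  weights <- weights / sum(weights)   # normalise so the weights sum to 1
  sum(weights[draws <= y])
}

draws <- c(-1.2, -0.3, 0.1, 0.8, 2.5)
w <- c(1, 1, 2, 4, 2)

raw_pit(draws, 0.5)                                 # 3 of 5 draws are <= 0.5
raw_pit(draws, 0.5, weights = w)                    # weighted mass (1 + 1 + 2) / 10
raw_pit(draws, 0.5, weights = log(w), log = TRUE)   # same value from log-weights
raw_pit(draws, 3)                                   # exactly 1: y lies beyond every draw
```

The last call shows why the tails need refinement: once an observation lies outside the range of the draws, the ECDF can only return 0 or 1, whereas the GPD-based tail values stay strictly inside the unit interval.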
pit() for the unsmoothed version, pareto_smooth() for
Pareto smoothing of draws.
library(posterior)
x <- example_draws()
# one simulated observation per variable in x
y <- rnorm(nvariables(x), 5, 5)
pareto_pit(x, y)