This function computes an estimate of \(t\) with an associated confidence interval, together with the corresponding estimate and confidence interval for \(\alpha\), the shape parameter under the assumption of a Pareto model, following Klar (2024). Three methods are implemented to compute the confidence intervals: a method based on the unbiased variance estimators of the underlying U-statistics and two resampling methods (jackknife and bootstrap).
pareto_tail(
x,
u,
confint = FALSE,
method = c("unbiased", "bootstrap", "jackknife"),
R = 1000,
conf.level = 0.95,
alpha.max = 100
)
A matrix containing:
The value of the threshold u.
Estimate of the tail functional t.
The lower bound of the confidence interval for t (if confint = TRUE).
The upper bound of the confidence interval for t (if confint = TRUE).
Estimate of the shape parameter alpha under a Pareto model.
The lower bound of the confidence interval for alpha (if confint = TRUE).
The upper bound of the confidence interval for alpha (if confint = TRUE).
x: a vector containing the sample data.
u: the threshold for the computation of t.
confint: a boolean value indicating whether the confidence interval should be computed.
method: the method used for computing the confidence intervals (options include unbiased variance estimator, jackknife, and bootstrap).
R: the number of bootstrap replicates.
conf.level: the confidence level for the interval.
alpha.max: the upper limit of the interval searched for the root in an internal routine (increase the default value of 100 if an error occurs).
In Klar (2024) the function $$ t_X(u) \;=\; \mathbb{E}\!\biggl[ \frac{\lvert X_1 - X_2 \rvert}{X_1 + X_2} \;\Big|\; \min\{X_1, X_2\} \,\ge u \biggr] $$ is proposed as a tool for detecting Pareto-type tails, where \(X_1, X_2, X\) are i.i.d. random variables from an absolutely continuous distribution supported on \([x_m,\infty)\). Theorem 1 in Klar (2024) shows that \(t_X(u)\) is constant in \(u\) if and only if \(X\) has a Pareto distribution.
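The link between a constant \(t_X(u)\) and the shape parameter can be made concrete with a short calculation. Writing a Pareto variable as \(X_i = x_m \exp(E_i/\alpha)\) with standard exponential \(E_i\) gives \(\lvert X_1 - X_2\rvert/(X_1 + X_2) = \tanh(\lvert E_1 - E_2\rvert/(2\alpha))\), and integrating against the standard exponential law of \(\lvert E_1 - E_2\rvert\) yields \(t = \alpha\,[\psi((\alpha+1)/2) - \psi(\alpha/2)] - 1\), where \(\psi\) is the digamma function. The sketch below evaluates this expression and inverts it numerically. The helper names `t_of_alpha` and `alpha_of_t` are hypothetical, the formula is derived here rather than quoted from the package, and the root search is only assumed to resemble the internal routine that `alpha.max` controls.

```r
# Hypothetical helpers (not part of the package).

# Constant value of t under a Pareto model with shape parameter alpha.
t_of_alpha <- function(alpha) {
  alpha * (digamma((alpha + 1) / 2) - digamma(alpha / 2)) - 1
}

# Invert t_of_alpha numerically. The map is strictly decreasing in alpha,
# so the root is unique when it lies inside (lower, alpha.max); otherwise
# uniroot() stops with an error and alpha.max has to be increased.
alpha_of_t <- function(t, alpha.max = 100) {
  uniroot(function(a) t_of_alpha(a) - t,
          lower = 1e-3, upper = alpha.max, tol = 1e-10)$root
}

t1 <- t_of_alpha(1)   # equals 2 * log(2) - 1 for alpha = 1
a1 <- alpha_of_t(t1)  # recovers alpha = 1 up to numerical error
```

Because small values of \(t\) correspond to large values of \(\alpha\), a very small estimate can push the root beyond the upper search limit, which is consistent with the advice to raise `alpha.max` when an error occurs.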
The estimator \(\hat{t}_n\bigl(X_{(k)}\bigr)\) can be computed recursively. For \(k = 2,\ldots,n-1\),
$$ \hat{t}_n\bigl(X_{(k)}\bigr) \;=\; \frac{n-k+2}{n-k}\,\hat{t}_n\bigl(X_{(k-1)}\bigr) \;-\; \frac{1}{\binom{\,n-k+1\,}{2}} \sum_{j=k}^{n} \frac{X_{(j)} - X_{(k-1)}}{X_{(j)} + X_{(k-1)}}\,, $$
which can be evaluated efficiently starting from \(\hat{t}_n\bigl(X_{(n-1)}\bigr) = \bigl(X_{(n)} - X_{(n-1)}\bigr)/\bigl(X_{(n)} + X_{(n-1)}\bigr)\), where \(X_{(k)}\) denotes the \(k\)-th order statistic.
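A minimal sketch of this recursion, assuming positive data without ties, is given below. Since the starting value sits at the largest threshold, the displayed relation is solved for \(\hat{t}_n(X_{(k-1)})\) and run downwards from \(k = n-1\) to \(k = 2\). The function names `t_hat_direct` and `t_hat_all` are hypothetical and this is not the package's own implementation; the brute-force version is included only to check the recursion.

```r
# Brute-force estimator: mean of |xi - xj| / (xi + xj) over all pairs
# whose smaller element is at least u.
t_hat_direct <- function(x, u) {
  y <- x[x >= u]
  stopifnot(length(y) >= 2)
  d <- outer(y, y, function(a, b) abs(a - b) / (a + b))
  mean(d[upper.tri(d)])
}

# Estimates at the thresholds X_(1), ..., X_(n-1) via the recursion,
# rearranged to give the (k-1)-th value from the k-th value.
t_hat_all <- function(x) {
  xs <- sort(x)
  n <- length(xs)
  stopifnot(n >= 2)
  t <- numeric(n - 1)
  t[n - 1] <- (xs[n] - xs[n - 1]) / (xs[n] + xs[n - 1])  # starting value
  if (n > 2) {
    for (k in (n - 1):2) {
      j <- k:n
      s <- sum((xs[j] - xs[k - 1]) / (xs[j] + xs[k - 1]))
      t[k - 1] <- (n - k) / (n - k + 2) * (t[k] + s / choose(n - k + 1, 2))
    }
  }
  t
}

set.seed(1)
x <- exp(rexp(40))  # Pareto sample with shape 1 and minimum 1
xs <- sort(x)
direct <- sapply(seq_len(length(x) - 1), function(k) t_hat_direct(x, xs[k]))
recursive <- t_hat_all(x)
```

In this direction every step multiplies the previous value by a factor smaller than one, so rounding errors are not amplified, and the whole curve is obtained with a number of operations of order \(n^2\) rather than one pairwise average per threshold.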
Confidence intervals for \(t(u)\) based on the following methods for variance estimation are also provided:
Unbiased variance estimator
Bootstrap resampling
Jackknife resampling
A two-sided \((1 - \gamma)\) confidence interval for \(t(u)\) is $$ \left[ \max\!\Bigl\{ \hat{t}_n(u) \;-\; z_{1 - \frac{\gamma}{2}} \,\frac{\hat{\sigma}_{u}}{ \sqrt{n\,U_n^{(2)}(u)} }, \;0 \Bigr\}, \, \min\!\Bigl\{ \hat{t}_n(u) \;+\; z_{1 - \frac{\gamma}{2}} \,\frac{\hat{\sigma}_{u}}{ \sqrt{n\,U_n^{(2)}(u)} }, \;1 \Bigr\} \right], $$ where \(z_{1 - \frac{\gamma}{2}} = \Phi^{-1}(1 - \tfrac{\gamma}{2})\) is the corresponding quantile of the standard normal distribution, \(\hat{\sigma}_u\) is an estimator of the standard deviation of \(c\,\hat{t}_n(u)\) for a constant \(c\) specified in Section 4.1 of Klar (2024), and \(U_n^{(2)}(u)\) is the U-statistic $$ U_n^{(2)}(u) \;=\; \frac{2}{n\,(n-1)} \sum_{i = 1}^n (n - i)\, \mathbf{1}\{X_{(i)} \,\ge\, u\}, $$ that is, the proportion of sample pairs whose minimum is at least \(u\).
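The pieces of this interval can be sketched as follows. The helpers `u2_stat` and `t_confint` are hypothetical names, the standard deviation estimate must be supplied by the caller because its construction is not reproduced here, and the numbers passed for `t_hat` and `sigma_hat` are arbitrary illustrative values rather than results.

```r
# Hypothetical helper: U_n^(2)(u), the fraction of pairs whose minimum
# is at least u.
u2_stat <- function(x, u) {
  xs <- sort(x)
  n <- length(xs)
  2 / (n * (n - 1)) * sum((n - seq_len(n)) * (xs >= u))
}

# Hypothetical helper: normal-approximation interval truncated to [0, 1].
# sigma_hat is taken as given (see Section 4.1 of Klar, 2024).
t_confint <- function(t_hat, sigma_hat, n, u2, conf.level = 0.95) {
  gamma <- 1 - conf.level
  half <- qnorm(1 - gamma / 2) * sigma_hat / sqrt(n * u2)
  c(lower = max(t_hat - half, 0), upper = min(t_hat + half, 1))
}

x <- c(1.2, 3.5, 2.1, 7.8, 1.9, 4.4)
u2 <- u2_stat(x, u = 2)  # 4 of 6 values are >= 2: choose(4, 2) / choose(6, 2)
ci <- t_confint(t_hat = 0.3, sigma_hat = 0.5, n = length(x), u2 = u2)
```

The factor \(n\,U_n^{(2)}(u)\) shrinks as the threshold moves into the tail, so the interval widens when fewer pairs exceed \(u\).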
Klar, B. (2024). A Pareto tail plot without moment restrictions. The American Statistician. https://doi.org/10.1080/00031305.2024.2413081
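# Simulate 1000 observations from a Pareto distribution with shape 1 and minimum 1
x <- actuar::rpareto1(1e3, shape = 1, min = 1)
# Estimate t at several empirical quantiles, rounded and used as thresholds
pareto_tail(x, round(quantile(x, c(0.1, 0.5, 0.75, 0.9, 0.95, 0.99))), confint = FALSE)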