
smoothtail (version 2.0.6)

smoothtail-package: Smooth Estimation of GPD Shape Parameter

Description

Given independent and identically distributed observations $X_1 < \dots < X_n$ from a Generalized Pareto distribution with shape parameter $\gamma \in [-1, 0]$, this package offers three methods to compute estimates of $\gamma$. The estimates are based on the principle of replacing the order statistics $X_{(1)}, \dots, X_{(n)}$ of the sample by the quantiles $\hat{X}_{(1)}, \dots, \hat{X}_{(n)}$ of the distribution function $\hat{F}_n$ based on the log-concave density estimator $\hat{f}_n$. This procedure is justified by the fact that the GPD density is log-concave for $\gamma \in [-1, 0]$.

Author

Kaspar Rufibach (maintainer), kaspar.rufibach@gmail.com,
http://www.kasparrufibach.ch

Samuel Mueller, samuel.muller@mq.edu.au

Kaspar Rufibach acknowledges support from the Swiss National Science Foundation (SNF), http://www.snf.ch.

Details

Package: smoothtail
Type: Package
Version: 2.0.5
Date: 2016-07-12
License: GPL (>= 2)

Use this package to estimate the shape parameter $\gamma$ of a Generalized Pareto Distribution (GPD). In extreme value theory, $\gamma$ is called the tail index. We offer three new estimators, all based on the fact that the density function of the GPD is log-concave if $\gamma \in [-1, 0]$; see Mueller and Rufibach (2009). The functions for estimation of the tail index are:

pickands
falk
falkMVUE
generalizedPick

This package depends on the package logcondens for estimation of a log-concave density: all of the above functions take as their first argument a dlc object as generated by logConDens in logcondens.
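
For orientation, here is a minimal sketch of that workflow (the full worked example in the Examples section below adds plotting and comparisons); the seed, sample size, and shape parameter are illustrative choices only:

# minimal sketch: build a dlc object, then pass it to the estimators
library(logcondens)
library(smoothtail)
set.seed(1)
x <- rgpd(50, -0.5)                                    # GPD sample with shape gamma = -0.5
est <- logConDens(x, smoothed = FALSE, print = FALSE)  # dlc object
pick <- pickands(est)                                  # each estimator takes the dlc object
falk.est <- falk(est)                                  # as its first argument
falkMVUE.est <- falkMVUE(est, 2)                       # known endpoint omega = -1 / gam = 2
genPick <- generalizedPick(est, c = 0.75, gam0 = -1/3)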

Additionally, functions for the density, distribution function, quantile function, and random number generation for a GPD with location parameter 0, shape parameter $\gamma$, and scale parameter $\sigma$ are provided (a short usage sketch follows the list):

dgpd
pgpd
qgpd
rgpd
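
As a small sketch of how these are called: the two-argument form (value, then shape) mirrors the calls to rgpd and pgpd in the Examples section and relies on the default scale; the same argument order for dgpd and qgpd is assumed here by analogy.

# assumed two-argument calls (value, shape), by analogy with rgpd / pgpd in the Examples
gam <- -0.5
x <- rgpd(5, gam)              # five random draws
dgpd(x, gam)                   # density evaluated at the draws
pgpd(x, gam)                   # distribution function at the draws
qgpd(c(0.25, 0.5, 0.75), gam)  # quartiles of the GPD with shape gam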

Let us briefly clarify what we mean by log-concave density estimation. Suppose we are given an ordered sample $Y_1 < \dots < Y_n$ of i.i.d. random variables having density function $f$, where $f = \exp \varphi$ for a concave function $\varphi : \mathbb{R} \to [-\infty, \infty)$. Following the development in Duembgen and Rufibach (2009), it is then possible to get an estimator $\hat{f}_n = \exp \hat{\varphi}_n$ of $f$ via the maximizer $\hat{\varphi}_n$ of

$$L(\varphi) = \sum_{i=1}^n \varphi(Y_i) - \int \exp \varphi(t) \, dt$$

over all concave functions $\varphi$. It turns out that $\hat{\varphi}_n$ is piecewise linear, with knots only at (some of the) observation points. Therefore, the infinite-dimensional optimization problem of finding the function $\hat{\varphi}_n$ boils down to the finite-dimensional problem of finding the vector $(\hat{\varphi}_n(Y_1), \dots, \hat{\varphi}_n(Y_n))$. How to solve this problem is described in Rufibach (2006, 2007) and, in a more general setting, in Duembgen, Huesler, and Rufibach (2010). The distribution function based on $\hat{f}_n$ is defined as

$$\hat{F}_n(x) = \int_{Y_1}^{x} \hat{f}_n(t) \, dt$$

for a real number $x$. This definition of $\hat{F}_n$ is justified by the fact that the log-concave density estimator $\hat{f}_n$ vanishes below $Y_1$, so that $\hat{F}_n(Y_1) = 0$.
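
A short sketch of these two facts on a generic sample, assuming the dlc components x, phi, Fhat, and IsKnot as documented in the logcondens package:

# illustrate piecewise linearity of hat(phi)_n and hat(F)_n(Y_1) = 0
library(logcondens)
set.seed(1)
y <- rnorm(30)                  # any sample from a log-concave density
est <- logConDens(y, smoothed = FALSE, print = FALSE)
est$x[est$IsKnot == 1]          # knots of the piecewise linear hat(phi)_n, a subset of the data
est$phi                         # finite-dimensional solution (hat(phi)_n(Y_1), ..., hat(phi)_n(Y_n))
est$Fhat[1]                     # hat(F)_n at the smallest observation, equal to 0 by construction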

References

Duembgen, L. and Rufibach, K. (2009). Maximum likelihood estimation of a log-concave density and its distribution function: basic properties and uniform consistency. Bernoulli, 15(1), 40--68.

Duembgen, L., Huesler, A. and Rufibach, K. (2010). Active set and EM algorithms for log-concave densities based on complete and censored data. Technical report 61, IMSV, University of Bern, available at http://arxiv.org/abs/0707.4643.

Mueller, S. and Rufibach, K. (2009). Smooth tail index estimation. J. Stat. Comput. Simul., 79, 1155--1167.

Mueller, S. and Rufibach, K. (2008). On the max-domain of attraction of distributions with log-concave densities. Statist. Probab. Lett., 78, 1440--1444.

Rufibach, K. (2006). Log-concave Density Estimation and Bump Hunting for i.i.d. Observations. PhD thesis, University of Bern, Switzerland and Georg-August University of Goettingen, Germany.
Available at https://biblio.unibe.ch/download/eldiss/06rufibach_k.pdf.

Rufibach, K. (2007). Computing maximum likelihood estimators of a log-concave density function. J. Stat. Comput. Simul., 77, 561--574.

See Also

Package logcondens.

Examples

# generate ordered random sample from GPD
set.seed(1977)
n <- 20
gam <- -0.75
x <- rgpd(n, gam)

# compute known endpoint
omega <- -1 / gam

# estimate log-concave density, i.e. generate dlc object
est <- logConDens(x, smoothed = FALSE, print = FALSE, gam = NULL, xs = NULL)

# plot distribution functions
s <- seq(0.01, max(x), by = 0.01)
plot(0, 0, type = 'n', ylim = c(0, 1), xlim = range(c(x, s))); rug(x)
lines(s, pgpd(s, gam), type = 'l', col = 2)
lines(x, 1:n / n, type = 's', col = 3)
lines(x, est$Fhat, type = 'l', col = 4)
legend(1, 0.4, c('true', 'empirical', 'estimated'), col = c(2 : 4), lty = 1)

# compute tail index estimators for all sensible indices k
falk.logcon <- falk(est)
falkMVUE.logcon <- falkMVUE(est, omega)
pick.logcon <- pickands(est)
genPick.logcon <- generalizedPick(est, c = 0.75, gam0 = -1/3)

# plot smoothed and unsmoothed estimators versus number of order statistics
plot(0, 0, type = 'n', xlim = c(0,n), ylim = c(-1, 0.2))
lines(1:n, pick.logcon[, 2], col = 1); lines(1:n, pick.logcon[, 3], col = 1, lty = 2)
lines(1:n, falk.logcon[, 2], col = 2); lines(1:n, falk.logcon[, 3], col = 2, lty = 2)
lines(1:n, falkMVUE.logcon[,2], col = 3); lines(1:n, falkMVUE.logcon[,3], col = 3, 
    lty = 2)
lines(1:n, genPick.logcon[, 2], col = 4); lines(1:n, genPick.logcon[, 3], col = 4, 
    lty = 2)
abline(h = gam, lty = 3)
legend(11, 0.2, c("Pickands", "Falk", "Falk MVUE", "Generalized Pickands'"), 
    lty = 1, col = 1:4)
