
smoothtail (version 2.0.6)

smoothtail-package: Smooth Estimation of GPD Shape Parameter

Description

Given independent and identically distributed observations $X_1 < \dots < X_n$ from a Generalized Pareto distribution with shape parameter $\gamma \in [-1, 0]$, this package offers three methods to compute estimates of $\gamma$. The estimates are based on the principle of replacing the order statistics $X_{(1)}, \dots, X_{(n)}$ of the sample by the quantiles $\hat{X}_{(1)}, \dots, \hat{X}_{(n)}$ of the distribution function $\hat{F}_n$ based on the log-concave density estimator $\hat{f}_n$. This procedure is justified by the fact that the GPD density is log-concave for $\gamma \in [-1, 0]$.

Author

Kaspar Rufibach (maintainer), kaspar.rufibach@gmail.com,
http://www.kasparrufibach.ch

Samuel Mueller, samuel.muller@mq.edu.au

Kaspar Rufibach acknowledges support from the Swiss National Science Foundation (SNF), http://www.snf.ch.

Details

Package: smoothtail
Type: Package
Version: 2.0.5
Date: 2016-07-12
License: GPL (>= 2)

Use this package to estimate the shape parameter $\gamma$ of a Generalized Pareto Distribution (GPD). In extreme value theory, $\gamma$ is called the tail index. We offer three new estimators, all based on the fact that the density function of the GPD is log-concave if $\gamma \in [-1, 0]$; see Mueller and Rufibach (2009). The functions for estimation of the tail index are:

pickands
falk
falkMVUE
generalizedPick

This package depends on the package logcondens for estimation of a log-concave density: all of the above functions take as their first argument a dlc object as generated by logConDens in logcondens.
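
For orientation, here is a minimal sketch of that workflow (the full worked example in the Examples section below adds plotting and comparisons); the seed, sample size, and shape parameter are illustrative choices only:

# minimal sketch: build a dlc object, then pass it to the estimators
library(logcondens)
library(smoothtail)
set.seed(1)
x <- rgpd(50, -0.5)                                    # GPD sample with shape gamma = -0.5
est <- logConDens(x, smoothed = FALSE, print = FALSE)  # dlc object
pick <- pickands(est)                                  # each estimator takes the dlc object
falk.est <- falk(est)                                  # as its first argument
falkMVUE.est <- falkMVUE(est, 2)                       # known endpoint omega = -1 / gam = 2
genPick <- generalizedPick(est, c = 0.75, gam0 = -1/3)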

Additionally, functions for the density, distribution function, quantile function, and random number generation for a GPD with location parameter 0, shape parameter $\gamma$, and scale parameter $\sigma$ are provided (a short usage sketch follows the list):

dgpd
pgpd
qgpd
rgpd
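
As a small sketch of how these are called: the two-argument form (value, then shape) mirrors the calls to rgpd and pgpd in the Examples section and relies on the default scale; the same argument order for dgpd and qgpd is assumed here by analogy.

# assumed two-argument calls (value, shape), by analogy with rgpd / pgpd in the Examples
gam <- -0.5
x <- rgpd(5, gam)              # five random draws
dgpd(x, gam)                   # density evaluated at the draws
pgpd(x, gam)                   # distribution function at the draws
qgpd(c(0.25, 0.5, 0.75), gam)  # quartiles of the GPD with shape gam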

Let us briefly clarify what we mean by log-concave density estimation. Suppose we are given an ordered sample $Y_1 < \dots < Y_n$ of i.i.d. random variables having density function $f$, where $f = \exp \varphi$ for a concave function $\varphi : \mathbb{R} \to [-\infty, \infty)$. Following the development in Duembgen and Rufibach (2009), it is then possible to get an estimator $\hat{f}_n = \exp \hat{\varphi}_n$ of $f$ via the maximizer $\hat{\varphi}_n$ of

$$L(\varphi) = \sum_{i=1}^n \varphi(Y_i) - \int \exp \varphi(t) \, dt$$

over all concave functions $\varphi$. It turns out that $\hat{\varphi}_n$ is piecewise linear, with knots only at (some of the) observation points. Therefore, the infinite-dimensional optimization problem of finding the function $\hat{\varphi}_n$ boils down to the finite-dimensional problem of finding the vector $(\hat{\varphi}_n(Y_1), \dots, \hat{\varphi}_n(Y_n))$. How to solve this problem is described in Rufibach (2006, 2007) and, in a more general setting, in Duembgen, Huesler, and Rufibach (2010). The distribution function based on $\hat{f}_n$ is defined as

$$\hat{F}_n(x) = \int_{Y_1}^{x} \hat{f}_n(t) \, dt$$

for a real number $x$. This definition of $\hat{F}_n$ is justified by the fact that the log-concave density estimator $\hat{f}_n$ vanishes below $Y_1$, so that $\hat{F}_n(Y_1) = 0$.
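
A short sketch of these two facts on a generic sample, assuming the dlc components x, phi, Fhat, and IsKnot as documented in the logcondens package:

# illustrate piecewise linearity of hat(phi)_n and hat(F)_n(Y_1) = 0
library(logcondens)
set.seed(1)
y <- rnorm(30)                  # any sample from a log-concave density
est <- logConDens(y, smoothed = FALSE, print = FALSE)
est$x[est$IsKnot == 1]          # knots of the piecewise linear hat(phi)_n, a subset of the data
est$phi                         # finite-dimensional solution (hat(phi)_n(Y_1), ..., hat(phi)_n(Y_n))
est$Fhat[1]                     # hat(F)_n at the smallest observation, equal to 0 by construction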

References

Duembgen, L. and Rufibach, K. (2009). Maximum likelihood estimation of a log-concave density and its distribution function: basic properties and uniform consistency. Bernoulli, 15(1), 40--68.

Duembgen, L., Huesler, A. and Rufibach, K. (2010). Active set and EM algorithms for log-concave densities based on complete and censored data. Technical report 61, IMSV, University of Bern, available at http://arxiv.org/abs/0707.4643.

Mueller, S. and Rufibach, K. (2009). Smooth tail index estimation. J. Stat. Comput. Simul., 79, 1155--1167.

Mueller, S. and Rufibach, K. (2008). On the max-domain of attraction of distributions with log-concave densities. Statist. Probab. Lett., 78, 1440--1444.

Rufibach, K. (2006). Log-concave Density Estimation and Bump Hunting for i.i.d. Observations. PhD thesis, University of Bern, Switzerland and Georg-August University of Goettingen, Germany.
Available at https://biblio.unibe.ch/download/eldiss/06rufibach_k.pdf.

Rufibach, K. (2007). Computing maximum likelihood estimators of a log-concave density function. J. Stat. Comput. Simul., 77, 561--574.

See Also

Package logcondens.

Examples

# generate ordered random sample from GPD
set.seed(1977)
n <- 20
gam <- -0.75
x <- rgpd(n, gam)

# compute known endpoint
omega <- -1 / gam

# estimate log-concave density, i.e. generate dlc object
est <- logConDens(x, smoothed = FALSE, print = FALSE, gam = NULL, xs = NULL)

# plot distribution functions
s <- seq(0.01, max(x), by = 0.01)
plot(0, 0, type = 'n', ylim = c(0, 1), xlim = range(c(x, s))); rug(x)
lines(s, pgpd(s, gam), type = 'l', col = 2)
lines(x, 1:n / n, type = 's', col = 3)
lines(x, est$Fhat, type = 'l', col = 4)
legend(1, 0.4, c('true', 'empirical', 'estimated'), col = c(2 : 4), lty = 1)

# compute tail index estimators for all sensible indices k
falk.logcon <- falk(est)
falkMVUE.logcon <- falkMVUE(est, omega)
pick.logcon <- pickands(est)
genPick.logcon <- generalizedPick(est, c = 0.75, gam0 = -1/3)

# plot smoothed and unsmoothed estimators versus number of order statistics
plot(0, 0, type = 'n', xlim = c(0,n), ylim = c(-1, 0.2))
lines(1:n, pick.logcon[, 2], col = 1); lines(1:n, pick.logcon[, 3], col = 1, lty = 2)
lines(1:n, falk.logcon[, 2], col = 2); lines(1:n, falk.logcon[, 3], col = 2, lty = 2)
lines(1:n, falkMVUE.logcon[,2], col = 3); lines(1:n, falkMVUE.logcon[,3], col = 3, 
    lty = 2)
lines(1:n, genPick.logcon[, 2], col = 4); lines(1:n, genPick.logcon[, 3], col = 4, 
    lty = 2)
abline(h = gam, lty = 3)
legend(11, 0.2, c("Pickands", "Falk", "Falk MVUE", "Generalized Pickands'"), 
    lty = 1, col = 1:4)
