This function chooses the \(\hat{\xi}\) and \(\hat{\beta}\) that minimize the negative log likelihood of the Generalized Pareto Distribution (GPD).
pot_estimator(data, u, start_xi = 0.1, start_beta = NULL, na.rm = FALSE)
An unnamed numeric vector of length 2 containing the estimated Generalized Pareto Distribution (GPD) parameters that minimize the negative log likelihood: \(\xi\) (shape/tail index) and \(\beta\) (scale parameter).
data: A numeric vector of i.i.d. observations.
u: A numeric scalar that specifies the threshold value used to calculate excesses.
start_xi: Initial value of \(\xi\) to pass to the optimizer.
start_beta: Initial value of \(\beta\) to pass to the optimizer.
na.rm: Logical. If TRUE, missing values (NA) are removed before analysis. Defaults to FALSE.
For \(\xi \neq 0\), the PDF of an excess data point \(x_i\) is given by:
$$f(x_i;\xi, \beta) = \frac{1}{\beta} \left(1 + \xi \frac{x_i}{\beta}\right)^{-\left(\frac{1}{\xi} + 1\right)}$$
Taking the logarithm of the above equation gives:
$$l(x_i;\xi, \beta)=-\log(\beta) - (\frac{1}{\xi} + 1) \log(1 + \xi \frac{x_i}{\beta})$$
Summing over all \(n\) excess data points:
$$l(\xi,\beta)=\sum_{i=1}^{n} (-\log(\beta) - (\frac{1}{\xi} + 1) \log(1 + \xi \frac{x_i}{\beta}))$$
$$l(\xi,\beta)=-n\log(\beta) - (\frac{1}{\xi} + 1)\sum_{i=1}^{n} \log(1 + \xi \frac{x_i}{\beta})$$
Maximizing \(l(\xi,\beta)\) is equivalent to minimizing \(-l(\xi,\beta)\), so the \(\xi\) and \(\beta\) that minimize the negative log likelihood are the same ones that maximize the log likelihood. Applied to the excesses, this yields the \(\xi\) and \(\beta\) that best fit the tail of the data.
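As a rough illustration of the minimization described here, the following sketch codes the negative log likelihood and passes it to base R's `optim`. It is not the package's implementation: `gpd_nll` is a hypothetical helper, the excesses are assumed to be the observations above `u` minus `u`, and the starting value for \(\beta\) (the mean excess) is an arbitrary choice made for this example.

```r
# Hypothetical helper (not part of the package): negative log likelihood
# of the GPD for a vector of excesses, with par = c(xi, beta).
gpd_nll <- function(par, excess) {
  xi <- par[1]
  beta <- par[2]
  n <- length(excess)
  if (beta <= 0) return(Inf)            # scale must be positive
  if (abs(xi) < 1e-8) {                 # exponential limit as xi -> 0
    return(n * log(beta) + sum(excess) / beta)
  }
  z <- 1 + xi * excess / beta
  if (any(z <= 0)) return(Inf)          # outside the support of the GPD
  n * log(beta) + (1 / xi + 1) * sum(log(z))
}

set.seed(1)
x <- rweibull(n = 800, shape = 0.8, scale = 1)
u <- 2
excess <- x[x > u] - u                  # assumed definition of the excesses

# Nelder-Mead (optim's default) tolerates the Inf values returned above.
start <- c(0.1, mean(excess))           # illustrative starting values
fit <- optim(par = start, fn = gpd_nll, excess = excess)
fit$par                                 # estimated c(xi, beta)
```

Returning `Inf` outside the valid parameter region is one simple way to keep an unconstrained optimizer inside the support; a bounded method would be an alternative design.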
The case \(\xi = 0\) must be considered separately, since the GPD then reduces to an exponential distribution. The total log likelihood in that case is:
$$l(0, \beta) = -n \log(\beta) - \frac{1}{\beta} \sum_{i=1}^{n} x_i$$
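The exponential expression is the limit of the general log likelihood as \(\xi \to 0\). A small numerical check, using made-up excesses and hypothetical helper names (`gpd_loglik`, `exp_loglik`) that are not part of the package, can make this concrete:

```r
# Hypothetical helpers for illustration only.
# General GPD log likelihood, valid for xi != 0.
gpd_loglik <- function(xi, beta, excess) {
  -length(excess) * log(beta) -
    (1 / xi + 1) * sum(log(1 + xi * excess / beta))
}

# Exponential log likelihood, the xi = 0 case.
exp_loglik <- function(beta, excess) {
  -length(excess) * log(beta) - sum(excess) / beta
}

excess <- c(0.2, 0.5, 1.3, 2.1, 0.05)   # made-up excesses
beta <- 1.5                             # arbitrary scale value

# With xi close to zero the two expressions should nearly coincide.
gap <- abs(gpd_loglik(1e-6, beta, excess) - exp_loglik(beta, excess))
gap
```

The gap shrinks as \(\xi\) approaches zero, which is why an implementation might switch to the exponential formula below some small tolerance rather than divide by a near-zero \(\xi\).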
Davison, A. C., & Smith, R. L. (1990). Models for exceedances over high thresholds. Journal of the Royal Statistical Society: Series B (Methodological), 52(3), 393-425. doi:10.1111/j.2517-6161.1990.tb01796.x
Balkema, A. A., & de Haan, L. (1974). Residual life time at great age. The Annals of Probability, 2(5), 792-804. doi:10.1214/aop/1176996548
Pickands, J. (1975). Statistical Inference Using Extreme Order Statistics. The Annals of Statistics, 3(1), 119–131. http://www.jstor.org/stable/2958083
Nair, J., Wierman, A., & Zwart, B. (2022). The Fundamentals of Heavy Tails: Properties, Emergence, and Estimation. Cambridge University Press. (pp. 221-226) doi:10.1017/9781009053730
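# Simulate 800 observations from a Weibull distribution
x <- rweibull(n = 800, shape = 0.8, scale = 1)
# Estimate the GPD parameters from the excesses over the threshold u = 2
values <- pot_estimator(data = x, u = 2, start_xi = 0.1, start_beta = NULL)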