Learn R Programming

VGAM (version 1.1-14)

gpd: Generalized Pareto Distribution Regression Family Function

Description

Maximum likelihood estimation of the 2-parameter generalized Pareto distribution (GPD).

Usage

gpd(threshold = 0, lscale = "loglink",
    lshape = "logofflink(offset = 0.5)",
    percentiles = c(90, 95), iscale = NULL, ishape = NULL,
    tolshape0 = 0.001, type.fitted = c("percentiles", "mean"),
    imethod = 1, zero = "shape")

Arguments

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm

and vgam. However, for this VGAM family function, vglm

is probably preferred over vgam when there is smoothing.

Details

The distribution function of the GPD can be written $$G(y) = 1 - [1 + \xi (y-\mu) / \sigma ]_{+}^{- 1/ \xi} $$ where \(\mu\) is the location parameter (known, with value threshold), \(\sigma > 0\) is the scale parameter, \(\xi\) is the shape parameter, and \(h_+ = \max(h,0)\). The function \(1-G\) is known as the survivor function. The limit \(\xi \rightarrow 0\) gives the shifted exponential as a special case: $$G(y) = 1 - \exp[-(y-\mu)/ \sigma]. $$ The support is \(y>\mu\) for \(\xi>0\), and \(\mu < y <\mu-\sigma / \xi\) for \(\xi<0\).

Smith (1985) showed that if \(\xi <= -0.5\) then this is known as the nonregular case and problems/difficulties can arise both theoretically and numerically. For the (regular) case \(\xi > -0.5\) the classical asymptotic theory of maximum likelihood estimators is applicable; this is the default.

Although for \(\xi < -0.5\) the usual asymptotic properties do not apply, the maximum likelihood estimator generally exists and is superefficient for \(-1 < \xi < -0.5\), so it is ``better'' than normal. When \(\xi < -1\) the maximum likelihood estimator generally does not exist as it effectively becomes a two parameter problem.

The mean of \(Y\) does not exist unless \(\xi < 1\), and the variance does not exist unless \(\xi < 0.5\). So if you want to fit a model with finite variance use lshape = "extlogitlink".

References

Yee, T. W. and Stephenson, A. G. (2007). Vector generalized linear and additive extreme value models. Extremes, 10, 1--19.

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. London: Springer-Verlag.

Smith, R. L. (1985). Maximum likelihood estimation in a class of nonregular cases. Biometrika, 72, 67--90.

See Also

rgpd, meplot, gev, paretoff, vglm, vgam, s.

Examples

Run this code
# Simulated data from an exponential distribution (xi = 0)
Threshold <- 0.5
gdata <- data.frame(y1 = Threshold + rexp(n = 3000, rate = 2))
fit <- vglm(y1 ~ 1, gpd(threshold = Threshold), data = gdata, trace = TRUE)
head(fitted(fit))
summary(depvar(fit))  # The original uncentred data
coef(fit, matrix = TRUE)  # xi should be close to 0
Coef(fit)
summary(fit)

head(fit@extra$threshold)  # Note the threshold is stored here

# Check the 90 percentile
ii <- depvar(fit) < fitted(fit)[1, "90%"]
100 * table(ii) / sum(table(ii))  # Should be 90%

# Check the 95 percentile
ii <- depvar(fit) < fitted(fit)[1, "95%"]
100 * table(ii) / sum(table(ii))  # Should be 95%

if (FALSE)  plot(depvar(fit), col = "blue", las = 1,
               main = "Fitted 90% and 95% quantiles")
matlines(1:length(depvar(fit)), fitted(fit), lty = 2:3, lwd = 2) 


# Another example
gdata <- data.frame(x2 = runif(nn <- 2000))
Threshold <- 0; xi <- exp(-0.8) - 0.5
gdata <- transform(gdata, y2 = rgpd(nn, scale = exp(1 + 0.1*x2), shape = xi))
fit <- vglm(y2 ~ x2, gpd(Threshold), data = gdata, trace = TRUE)
coef(fit, matrix = TRUE)


if (FALSE)  # Nonparametric fits
# Not so recommended:
fit1 <- vgam(y2 ~ s(x2), gpd(Threshold), data = gdata, trace = TRUE)
par(mfrow = c(2, 1))
plot(fit1, se = TRUE, scol = "blue")
# More recommended:
fit2 <- vglm(y2 ~ sm.bs(x2), gpd(Threshold), data = gdata, trace = TRUE)
plot(as(fit2, "vgam"), se = TRUE, scol = "blue") 

Run the code above in your browser using DataLab