Learn R Programming

VGAM (version 1.1-14)

gumbel: Gumbel Regression Family Function

Description

Maximum likelihood estimation of the 2-parameter Gumbel distribution.

Usage

gumbel(llocation = "identitylink", lscale = "loglink",
       iscale = NULL, R = NA, percentiles = c(95, 99),
       mpv = FALSE, zero = NULL)
gumbelff(llocation = "identitylink", lscale = "loglink",
         iscale = NULL, R = NA, percentiles = c(95, 99),
         zero = "scale", mpv = FALSE)

Arguments

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, and vgam.

Details

The Gumbel distribution is a generalized extreme value (GEV) distribution with shape parameter \(\xi = 0\). Consequently it is more easily estimated than the GEV. See gev for more details.

The quantity \(R\) is the maximum number of observations possible, for example, in the Venice data below, the top 10 daily values are recorded for each year, therefore \(R = 365\) because there are about 365 days per year. The MPV is the value of the response such that the probability of obtaining a value greater than the MPV is 0.5 out of \(R\) observations. For the Venice data, the MPV is the sea level such that there is an even chance that the highest level for a particular year exceeds the MPV. If mpv = TRUE then the column labelled "MPV" contains the MPVs when fitted() is applied to the fitted object.

The formula for the mean of a response \(Y\) is \(\mu+\sigma \times Euler\) where \(Euler\) is a constant that has value approximately equal to 0.5772. The formula for the percentiles are (if R is not given) \(\mu-\sigma \times \log[-\log(P/100)]\) where \(P\) is the percentile argument value(s). If R is given then the percentiles are \(\mu-\sigma \times \log[R(1-P/100)]\).

References

Yee, T. W. and Stephenson, A. G. (2007). Vector generalized linear and additive extreme value models. Extremes, 10, 1--19.

Smith, R. L. (1986). Extreme value theory based on the r largest annual events. Journal of Hydrology, 86, 27--43.

Rosen, O. and Cohen, A. (1996). Extreme percentile regression. In: Haerdle, W. and Schimek, M. G. (eds.), Statistical Theory and Computational Aspects of Smoothing: Proceedings of the COMPSTAT '94 Satellite Meeting held in Semmering, Austria, 27--28 August 1994, pp.200--214, Heidelberg: Physica-Verlag.

Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. London: Springer-Verlag.

See Also

rgumbel, dgumbelII, cens.gumbel, guplot, gev, gevff, venice.

Examples

Run this code
# Example 1: Simulated data
gdata <- data.frame(y1 = rgumbel(n = 1000, loc = 100, scale = exp(1)))
fit1 <- vglm(y1 ~ 1, gumbelff(perc = NULL), data = gdata, trace = TRUE)
coef(fit1, matrix = TRUE)
Coef(fit1)
head(fitted(fit1))
with(gdata, mean(y1))

# Example 2: Venice data
(fit2 <- vglm(cbind(r1, r2, r3, r4, r5) ~ year, data = venice,
              gumbel(R = 365, mpv = TRUE), trace = TRUE))
head(fitted(fit2))
coef(fit2, matrix = TRUE)
sqrt(diag(vcov(summary(fit2))))   # Standard errors

# Example 3: Try a nonparametric fit ---------------------
# Use the entire data set, including missing values
# Same as as.matrix(venice[, paste0("r", 1:10)]):
Y <- Select(venice, "r", sort = FALSE)
fit3 <- vgam(Y ~ s(year, df = 3), gumbel(R = 365, mpv = TRUE),
             data = venice, trace = TRUE, na.action = na.pass)
depvar(fit3)[4:5, ]  # NAs used to pad the matrix

if (FALSE)   # Plot the component functions
par(mfrow = c(2, 3), mar = c(6, 4, 1, 2) + 0.3, xpd = TRUE)
plot(fit3, se = TRUE, lcol = "blue", scol = "limegreen", lty = 1,
     lwd = 2, slwd = 2, slty = "dashed")

# Quantile plot --- plots all the fitted values
qtplot(fit3, mpv = TRUE, lcol = c(1, 2, 5), tcol = c(1, 2, 5), lwd = 2,
       pcol = "blue", tadj = 0.1, ylab = "Sea level (cm)")

# Plot the 99 percentile only
year <- venice[["year"]]
matplot(year, Y, ylab = "Sea level (cm)", type = "n")
matpoints(year, Y, pch = "*", col = "blue")
lines(year, fitted(fit3)[, "99%"], lwd = 2, col = "orange")

# Check the 99 percentiles with a smoothing spline.
# Nb. (1-0.99) * 365 = 3.65 is approx. 4, meaning the 4th order
# statistic is approximately the 99 percentile.
plot(year, Y[, 4], ylab = "Sea level (cm)", type = "n",
     main = "Orange is 99 percentile, Green is a smoothing spline")
points(year, Y[, 4], pch = "4", col = "blue")
lines(year, fitted(fit3)[, "99%"], lty = 1, col = "orange")
lines(smooth.spline(year, Y[, 4], df = 4), col = "limegreen", lty = 2)

Run the code above in your browser using DataLab