fgev: Maximum-likelihood Fitting of the Generalized Extreme Value Distribution

Description

Maximum-likelihood fitting for the generalized extreme value distribution, including linear modelling of the location parameter, and allowing any of the parameters to be held fixed if desired.

Usage

fgev(x, start, …, nsloc = NULL, prob = NULL, std.err = TRUE,
    corr = FALSE, method = "BFGS", warn.inf = TRUE)

Arguments

x

A numeric vector, which may contain missing values.

start

A named list giving the initial values for the parameters over which the likelihood is to be maximized. If start is omitted the routine attempts to find good starting values using moment estimators.

…

Additional parameters, either for the GEV model or for the optimization function optim. If parameters of the model are included they will be held fixed at the values given (see Examples).

nsloc

A data frame with the same number of rows as the length of x, for linear modelling of the location parameter. The data frame is treated as a covariate matrix (excluding the intercept). A numeric vector can be given as an alternative to a single column data frame.

prob

Controls the parameterization of the model (see Details). Should be either NULL (the default), or a probability in the closed interval [0,1].

std.err

Logical; if TRUE (the default), the standard errors are returned.

corr

Logical; if TRUE, the correlation matrix is returned.

method

The optimization method (see optim for details).

warn.inf

Logical; if TRUE (the default), a warning is given if the negative log-likelihood is infinite when evaluated at the starting values.

Value

Returns an object of class c("gev","uvevd","evd").

The generic accessor functions fitted (or fitted.values), std.errors, deviance, logLik and AIC extract various features of the returned object.

The functions profile and profile2d are used to obtain deviance profiles for the model parameters. In particular, profiles of the quantile $z_p$ can be calculated and plotted when prob = p. The function anova compares nested models. The function plot produces diagnostic plots.
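As a brief illustration of the accessor functions named above, the following sketch fits a stationary model to simulated data and extracts the main quantities. It assumes the evd package is installed; the seed and simulation settings are arbitrary choices made for this example only.

```r
# Minimal sketch: runs only if the evd package is available.
if (requireNamespace("evd", quietly = TRUE)) {
  library(evd)
  set.seed(1)  # arbitrary seed, for reproducibility of the illustration
  x <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2)
  fit <- fgev(x)

  print(fitted(fit))      # maximum likelihood estimates
  print(std.errors(fit))  # standard errors
  print(deviance(fit))    # deviance at the estimates
  print(logLik(fit))      # log-likelihood
  print(AIC(fit))         # Akaike information criterion
}
```

The same information is available directly from the list components described below, for example fit$estimate and fit$std.err.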

An object of class c("gev","uvevd","evd") is a list containing at most the following components:

estimate

A vector containing the maximum likelihood estimates.

std.err

A vector containing the standard errors.

fixed

A vector containing the parameters of the model that have been held fixed.

param

A vector containing all parameters (optimized and fixed).

deviance

The deviance at the maximum likelihood estimates.

corr

The correlation matrix.

var.cov

The variance covariance matrix.

convergence, counts, message

Components taken from the list returned by optim.

data

The data passed to the argument x.

tdata

The data, transformed to stationarity (for non-stationary models).

nsloc

The argument nsloc.

n

The length of x.

prob

The argument prob.

loc

The location parameter. If prob is NULL (the default), this will also be an element of param.

call

The call of the current function.

Warning

The standard errors and the correlation matrix in the returned object are taken from the observed information, calculated by a numerical approximation. They must be interpreted with caution when the shape parameter is less than $-0.5$, because the usual asymptotic properties of maximum likelihood estimators do not then hold (Smith, 1985).
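One way to act on this warning is to test the shape parameter before relying on the standard errors. The helper below is a hypothetical convenience function written for this illustration, not part of the package; it works on any named numeric vector that has a shape element, such as the param component of a fitted object.

```r
# Hypothetical helper (not part of evd): TRUE when the shape parameter is
# below -0.5, so that standard errors should be interpreted with caution.
needs_caution <- function(param) {
  shape <- unname(param["shape"])
  if (is.na(shape)) {
    stop("no element named 'shape' found")
  }
  shape < -0.5
}

# Illustrative parameter vectors (made-up values, not fitted results)
needs_caution(c(loc = 0.1, scale = 1.0, shape = -0.7))
needs_caution(c(loc = 0.1, scale = 1.0, shape = 0.2))
```

With a model fitted by fgev, the call would be needs_caution(fit$param), since param holds both the optimized and the fixed parameters.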

Details

If prob is NULL (the default):

For stationary models the parameter names are loc, scale and shape, for the location, scale and shape parameters respectively. For non-stationary models, the parameter names are loc, locx1, …, locxn, scale and shape, where x1, …, xn are the column names of nsloc, so that loc is the intercept of the linear model, and locx1, …, locxn are the ncol(nsloc) coefficients. If nsloc is a vector it is converted into a single column data frame with column name trend, and hence the associated trend parameter is named loctrend.
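The naming rule can be sketched in base R. The function gev_param_names below is a hypothetical helper that only mimics the convention described above for the default parameterization; it does not call the package and is not how fgev builds the names internally.

```r
# Hypothetical helper: expected parameter names when prob is NULL.
gev_param_names <- function(nsloc = NULL) {
  if (is.null(nsloc)) {
    covs <- character(0)   # stationary model
  } else if (is.data.frame(nsloc)) {
    covs <- names(nsloc)   # one coefficient per column
  } else {
    covs <- "trend"        # a vector becomes a single column named trend
  }
  # Guard against paste0() recycling a zero-length vector to ""
  loc_names <- if (length(covs) > 0) paste0("loc", covs) else character(0)
  c("loc", loc_names, "scale", "shape")
}

gev_param_names()
gev_param_names((-49:50) / 100)
gev_param_names(data.frame(trend = 1:3, random = c(0.2, -0.1, 0.4)))
```

These are the names to use in start, or when holding a parameter fixed through the … argument.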

If prob = p is a probability:

The fit is performed using a different parameterization. Let $a$, $b$ and $s$ denote the location, scale and shape parameters of the GEV distribution. For stationary models, the distribution is parameterized using $(z_p,b,s)$, where $$z_p = a - \frac{b}{s}\left(1 - (-\log(1 - p))^{-s}\right)$$ is such that $G(z_p) = 1 - p$, where $G$ is the GEV distribution function. prob = p is therefore the probability in the upper tail corresponding to the quantile $z_p$. If prob is zero, then $z_p$ is the upper end point $a - b/s$, and $s$ is restricted to the negative (Weibull) axis. If prob is one, then $z_p$ is the lower end point $a - b/s$, and $s$ is restricted to the positive (Fréchet) axis. The parameter names are quantile, scale and shape, for $z_p$, $b$ and $s$ respectively.

For non-stationary models the parameter $z_p$ is again given by the equation above, but $a$ becomes the intercept of the linear model for the location parameter, so that quantile replaces (the intercept) loc, and hence the parameter names are quantile, locx1, …, locxn, scale and shape, where x1, …, xn are the column names of nsloc.
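The defining property $G(z_p) = 1 - p$ can be checked numerically in base R. The sketch below writes out the GEV distribution function and the quantile it implies for a nonzero shape; the function names pgev_sketch and zp_sketch and the parameter values are made up for this illustration, and the package's own functions are not used.

```r
# GEV distribution function G with location a, scale b and shape s (s != 0)
pgev_sketch <- function(z, a, b, s) {
  t <- pmax(1 + s * (z - a) / b, 0)  # zero outside the support
  exp(-t^(-1 / s))
}

# Quantile z_p with upper tail probability p, obtained by solving G(z_p) = 1 - p
zp_sketch <- function(p, a, b, s) {
  a - b / s * (1 - (-log(1 - p))^(-s))
}

a <- 0.13; b <- 1.1; s <- 0.2  # illustrative parameter values
p <- c(0.1, 0.01)
zp <- zp_sketch(p, a, b, s)
pgev_sketch(zp, a, b, s)       # agrees with 1 - p up to rounding

# Limiting cases: both reduce to the end point a - b/s
zp_sketch(0, a, b, -0.2)       # upper end point, negative shape
zp_sketch(1, a, b, 0.2)        # lower end point, positive shape
```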

In either case:

For non-stationary fitting it is recommended that the covariates within the linear model for the location parameter are (at least approximately) centered and scaled (i.e. that the columns of nsloc are centered and scaled), particularly if automatic starting values are used, since the starting values for the associated parameters are then zero.
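A minimal sketch of this preprocessing with the base R function scale is given below. The covariate names and values (year, index) are invented for the illustration.

```r
# Hypothetical raw covariates on very different scales
n <- 100
covs <- data.frame(
  year  = 1921:2020,
  index = seq(10, 30, length.out = n)
)

# Center each column to mean zero and scale it to unit standard deviation
nsloc_scaled <- as.data.frame(scale(covs))

colMeans(nsloc_scaled)       # approximately zero
sapply(nsloc_scaled, sd)     # equal to one

# The scaled data frame would then be passed as the nsloc argument, e.g.
# fgev(x, nsloc = nsloc_scaled), with x a data vector of length n.
```

The fitted coefficients then refer to the scaled covariates, so they need to be converted back if effects on the original scale are required.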

References

Smith, R. L. (1985) Maximum likelihood estimation in a class of non-regular cases. Biometrika, 72, 67–90.

Examples


library(evd)

# Simulate GEV data and a centered linear trend covariate
uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2)
trend <- (-49:50)/100

# Nested models: linear trend in the location, stationary,
# shape fixed at 0 (Gumbel), and both scale and shape fixed
M1 <- fgev(uvdata, nsloc = trend, control = list(trace = 1))
M2 <- fgev(uvdata)
M3 <- fgev(uvdata, shape = 0)
M4 <- fgev(uvdata, scale = 1, shape = 0)
anova(M1, M2, M3, M4)

# Diagnostic plots and deviance profiles for the stationary fit
par(mfrow = c(2,2))
plot(M2)
M2P <- profile(M2)
plot(M2P)

# Two covariates; the second call holds the coefficient of random fixed at 0
rnd <- runif(100, min = -.5, max = .5)
fgev(uvdata, nsloc = data.frame(trend = trend, random = rnd))
fgev(uvdata, nsloc = data.frame(trend = trend, random = rnd), locrandom = 0)

# Quantile parameterization: upper tail probabilities 0.1 and 0.01
uvdata <- rgev(100, loc = 0.13, scale = 1.1, shape = 0.2)
M1 <- fgev(uvdata, prob = 0.1)
M2 <- fgev(uvdata, prob = 0.01)
M1P <- profile(M1, which = "quantile")
M2P <- profile(M2, which = "quantile")
plot(M1P)
plot(M2P)
