mex: Conditional multivariate extreme values modelling

Description

Fit the conditional multivariate extreme value model of Heffernan and Tawn

Usage

mex(data, which, mth, mqu, dqu, margins = "laplace", constrain = TRUE, v = 10, 
    penalty = "gaussian", maxit = 10000, trace = 0, verbose = FALSE, priorParameters = NULL)
## S3 method for class 'mex':
plot(x, quantiles = seq(0.1, by = 0.2, len = 5), col = "grey", ...)
## S3 method for class 'mex':
predict(object, which, pqu=0.99, nsim=1000, trace=10, ...)
## S3 method for class 'predict.mex':
summary(object, mth, probs=c(0.05, 0.5, 0.95), ...)
## S3 method for class 'predict.mex':
plot(x, pch=c(1, 3), col=c(2, 8), cex=c(1, 1), ask=TRUE, ...)

Arguments

data

A numeric matrix or data.frame, the columns of which are to be modelled.

which

The variable on which to condition. This can be either scalar, indicating the column number of the conditioning variable, or character, giving the column name of the conditioning variable.

mth

In mex, the threshold above which to fit generalized Pareto distributions. If this is a vector of length 1, the same threshold will be used for each variable. Otherwise, it should be a vector whose length is

mqu

As an alternative to specifying the marginal GPD fitting thresholds via mth, you can specify the quantile (a probability) above which to fit generalized Pareto distributions. If this is a vector of length 1, the same quantile

dqu

Used to specify the quantile at which to threshold the conditioning variable data when estimating the dependence parameters. For example dqu=0.7 will result in the data with the highest 30 threshold will be

margins

See documentation for mexDependence.

constrain

See documentation for mexDependence.

penalty

How to penalize the likelihood when estimating the marginal generalized Pareto distributions. Defaults to ``gaussian''. See the help file for gpd for more information.

maxit

The maximum number of iterations to be used by the optimizer. defaults to maxit = 10000.

trace

Passed internall to optim. Whether or not to inform the user of the progress of the optimizer. Defaults to 0, indicating no trace.

verbose

Whether or not to keep the user informed of progress. Defaults to verbose = FALSE.

priorParameters

Parameters of prior/penalty used for estimation of the GPD parameters. This is only used if penalty = 'gaussian'. It is a named list, each element of which contains two components: the first component s

quantiles

A vector of quantiles taking values between 0 and 1 specifying the quantiles of the conditional distributions which will be plotted.

col

In plot method for objects of class mex, the color for points on scatterplots of residuals and original data respectively. In plot method for objects of class predict.mex, the colours of poi

x, object

Object of class mex as returned by function mex.

pqu

Argument to predict method. The quantile of the conditioning variable above which it will be simulated for importance sampling based prediction. Defaults to pqu = .99.

nsim

Argument to predict method. The number of simulated observations to be generated for prediction.

probs

In summary method for objects of class predict.mex: the quantiles of the conditional distribution(s) to calculate. Defaults to 5%, 50% and 95%.

pch, cex

Plotting characters: colours and symbol expansion. The observed and simulated data are plotted using different symbols, controlled by these arguments and col, each of which should be of length 2.

ask

Whether or not to ask before changing the plot. Defaults to ask = TRUE.

...

Further arguments to be passed to methods.

Value

A call to mex returns an list of class mex containing the following three items:
marginsAn object of class migpd.
dependenceAn object of class mexDependence.
callThis matches the original function call.
There are plot, summary, coef and predict methods for this class.
A call to predict.mex does the importance sampling for prediction, and returns a list of class "predict.mex" for which there are print and plot methods available. The summary method for this class of object is intended to be used following a call to the predict method, to estimate quantiles or probabilities of threshold excesses for the fitted conditional distributions given the conditioning variable above the threshold for prediction. See examples below.
There are print, summary and plot methods available for the class 'predict.mex'.

Details

The function mex works as follows. First, Generalized Pareto distributions (GPD) are fitted to the upper tails of each of the marginal distributions of the data: the GPD parameters are estimated for each column of the data in turn, independently of all other columns. Then, the conditional multivariate approach of Heffernan and Tawn is used to model the dependence between variables. The returned object is of class 'mex'. This function is a wrapper for calls to migpd and mexDependence, which estimate parameters of the marginal and dependence components of the Heffernan and Tawn model respectively. See documentation of these functions for details of modelling issues including the use of penalties / priors, threshold choice and checking for convergence of parameter estimates.

The plot method produces diagnostic plots for the fitted dependence model described by Heffernan and Tawn, 2004. The plots are best viewed by using the plotting area split by par(mfcol=c(.,.)) rather than mfrow, see examples below. Three diagnostic plots are produced for each dependent variable: 1) Scatterplots of the residuals Z from the fitted model of Heffernan and Tawn (2004) are plotted against the quantile of the conditioning variable, with a lowess curve showing the local mean of these points. 2) The absolute value of Z-mean(Z) is also plotted, again with the lowess curve showing the local mean of these points. Any trend in the location or scatter of these variables with the conditioning variable indicates a violation of the model assumption that the residuals Z are indepenendent of the conditioning variable. This can be indicative of the dependence threshold used being too low. 3) The final plots show the original data (on the original scale) and the fitted quantiles (specified by quantiles) of the conditional distribution of each dependent variable given the conditioning variable. A model that fits well will have good agreement between the distribution of the raw data (shown by the scatter plot) and the fitted quantiles. Note that the raw data are a sample from the joint distribution, whereas the quantiles are those of the estimated conditional distribution given the value of the conditioning variable, and while these two distributions should move into the same part of the sample space as the conditioning variable becomes more extreme, they are not the same thing!

The predict method for mex works as follows. The returned object has class 'predict.mex'. Simulated values of the dependent variables are created, given that the conditioning variable is above its 100pqu quantile. If predict.mex is passed an object object of class "mex" then the simulated values are based only on the point estimate of the dependence model parameters, and the original data. If predict.mex is passed an object object of class "bootmex" then the returned value additionally contains simulated replicate data sets corresponding to the bootstrap model parameter estimates. In both cases, the simulated values based on the original data and point estimates appear in component object$data$simulated. The simulated data from the bootstrap estimates appear in object$replicates. The plot method for class "oredict.mex" displays both the original data and the simulated data generated above the threshold for prediction; it shows the threshold for prediction (vertical line) and also the curve joining equal quantiles of the marginal distributions -- this is for reference: variables that are perfectly dependent will lie exactly on this curve.

References

J. E. Heffernan and J. A. Tawn, A conditional approach for multivariate extreme values, Journal of the Royal Statistical Society B, 66, 497 - 546, 2004

Examples

Run this code

w <- mex(winter, mqu=.7)
w
par(mfcol=c(3, 2))
plot(w)

par(mfcol=c(2,2))
p <- predict(w)
summary(p)
summary(p,probs=c(0.01,0.2,0.5,0.8,0.99))
summary(p,probs=0.5,mth=c(40,50,150,25,50))
p
plot(p)

Run the code above in your browser using DataLab