tcplot: Parameter Threshold Stability Plots

Description

Plots the maximum likelihood estimates (MLE) of the GPD parameters against the threshold.

Usage

tcplot(data, tlim = NULL, nt = min(100, length(data)),
    p.or.n = FALSE, alpha = 0.05, ylim.xi = NULL,
    ylim.sigmau = NULL, legend.loc = "bottomleft",
    try.thresh = quantile(data, 0.9, na.rm = TRUE), ...)

tshapeplot(data, tlim = NULL,
    nt = min(100, length(data)), p.or.n = FALSE,
    alpha = 0.05, ylim = NULL, legend.loc = "bottomleft",
    try.thresh = quantile(data, 0.9, na.rm = TRUE),
    main = "Shape Threshold Stability Plot",
    xlab = "Threshold u", ylab = "Shape Parameter", ...)

tscaleplot(data, tlim = NULL,
    nt = min(100, length(data)), p.or.n = FALSE,
    alpha = 0.05, ylim = NULL, legend.loc = "bottomleft",
    try.thresh = quantile(data, 0.9, na.rm = TRUE),
    main = "Modified Scale Threshold Stability Plot",
    xlab = "Threshold u",
    ylab = "Modified Scale Parameter", ...)

Arguments

ylim.xi

y-axis limits for shape parameter or NULL

ylim.sigmau

y-axis limits for scale parameter or NULL

data

vector of sample data

tlim

vector of (lower, upper) limits of the range of thresholds to evaluate, or NULL to use default values

nt

number of thresholds at which to evaluate the MLE

p.or.n

logical, should tail fraction (FALSE) or number of exceedances (TRUE) be given on upper x-axis

alpha

numeric, significance level in the range (0, 1)

legend.loc

location of legend (see legend)

try.thresh

vector of thresholds at which to fit the GPD using MLE and show the estimated parameters

...

further arguments to be passed to the plotting functions

ylim

y-axis limits or NULL

main

title of plot

xlab

x-axis label

ylab

y-axis label

Value

tshapeplot and tscaleplot produce the threshold stability plot for the shape and scale parameter respectively. They also return a matrix containing columns of the threshold, number of exceedances, MLE shape/scale, their standard deviation and $100(1 - \alpha)\%$ Wald confidence interval. Where the observed information matrix is not obtainable, the standard deviation and confidence intervals are NA. For tscaleplot the modified scale quantities are also provided. tcplot produces both plots on one graph and outputs a merged data frame of results.
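As a rough illustration of how a Wald interval of this kind is formed from an estimate and its standard deviation, the following base R sketch may help. The estimates and standard deviations are hypothetical values chosen for illustration only, the variable names are not those of the returned matrix, and the functions' internal computation may differ in detail.

```r
# Hypothetical MLE shape estimates and standard deviations (illustrative only)
xi_hat <- c(0.10, 0.15, 0.20)
xi_sd  <- c(0.05, NA, 0.08)  # NA mimics an unobtainable observed information matrix

alpha <- 0.05
z <- qnorm(1 - alpha / 2)    # standard normal quantile for a 100(1 - alpha)% interval

# Wald interval: estimate plus or minus z times the standard deviation
ci <- cbind(lower = xi_hat - z * xi_sd,
            upper = xi_hat + z * xi_sd)
print(ci)
```

Rows with a missing standard deviation propagate NA into both interval limits, which matches the behaviour described above.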

Details

The MLEs of the (modified) GPD scale and shape (xi) parameters are plotted against a range of possible thresholds; these are known as threshold stability plots (Coles, 2001). The modified scale parameter is $\sigma_u - u\xi$. If the GPD is a suitable model for a threshold $u$, then it is also suitable for all higher thresholds $v > u$, with the shape and modified scale being constant.

In practice there is sample uncertainty in the parameter estimates, which must be taken into account when choosing a threshold. The usual asymptotic Wald confidence intervals, based on the observed information matrix, are shown to measure this uncertainty. The sampling density of the Wald normal approximation is shown by a greyscale image, where lighter greys indicate low density.

A pre-chosen threshold (or more than one) can be given in try.thresh. The GPD is fitted to the excesses using maximum likelihood estimation. The estimated parameters are plotted as a horizontal line, which is solid above this threshold, where the parameter estimates from smaller tail fractions should be the same if the GPD is a good model (up to sample uncertainty). The threshold should always be chosen to be as low as possible to reduce sample uncertainty. Therefore, below the pre-chosen threshold, where the GPD should not be a good model, the line is dashed and the parameter estimates should deviate from the dashed line (otherwise a lower threshold could be used).

If no threshold limits are provided (tlim = NULL), then the lowest threshold is set to be just below the median data point and the maximum threshold is set to the 11th largest datapoint. This is a slightly lower order statistic than that used in the mrlplot function for the MRL plot, to account for the fact that maximum likelihood estimation is likely to be very unreliable with 10 or fewer datapoints. The permitted thresholds range from just below the minimum datapoint to the second largest value.

If there are fewer unique values of data within the threshold range than the number of threshold evaluations requested, then instead of a sequence of thresholds they will be set to each unique datapoint, i.e. MLE will only be applied where there is data.

The missing (NA and NaN) and non-finite values are ignored. The lower x-axis is the threshold and an upper axis either gives the number of exceedances (p.or.n = FALSE) or proportion of excess (p.or.n = TRUE). Note that, unlike the gpd related functions, the missing values are ignored, so do not add to the lower tail fraction. But ignoring the missing values is consistent with all the other mixture model functions.
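The constancy of the shape and modified scale above a valid threshold follows from the threshold stability property of the GPD. A short sketch of the argument, assuming $\xi \neq 0$ and writing $X$ for the underlying random variable and $y > 0$ for an excess (notation introduced here for illustration only): if the excesses over $u$ follow a GPD with scale $\sigma_u$ and shape $\xi$, then for any $v > u$ within the support,

$$
P(X - v > y \mid X > v)
= \frac{\left[1 + \xi (y + v - u)/\sigma_u\right]^{-1/\xi}}{\left[1 + \xi (v - u)/\sigma_u\right]^{-1/\xi}}
= \left[1 + \frac{\xi y}{\sigma_u + \xi (v - u)}\right]^{-1/\xi} .
$$

So the excesses over $v$ are again GPD with the same shape $\xi$ and scale $\sigma_v = \sigma_u + \xi (v - u)$. Rearranging gives

$$
\sigma_v - v\xi = \sigma_u - u\xi ,
$$

which is why the modified scale, unlike the scale itself, should appear flat in the plot above a suitable threshold.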

References

Scarrott, C.J. and MacDonald, A. (2012). A review of extreme value threshold estimation and uncertainty quantification. REVSTAT - Statistical Journal 10(1), 33-59. Available from http://www.ine.pt/revstat/pdf/rs120102.pdf

Coles, S.G. (2001). An Introduction to the Statistical Modelling of Extreme Values. Springer-Verlag: London.

Examples

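library(evmix)
set.seed(1)  # make the simulated sample reproducible
x <- rnorm(1000)

# Both stability plots, using the default threshold range
tcplot(x)
# Individual plots restricted to thresholds between 0 and 2
tshapeplot(x, tlim = c(0, 2))
# Overlay fits at several pre-chosen thresholds
tscaleplot(x, tlim = c(0, 2), try.thresh = c(0.5, 1, 1.5))
tcplot(x, tlim = c(0, 2), try.thresh = c(0.5, 1, 1.5))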