Learn R Programming

fExtremes (version 220.10063)

GpdFit: Modelling the Generalized Pareto Distribution

Description

A collection and description of functions to model the Generalized Pareto Distribution, GPD, based on R's 'evir' package. Two approaches for parameter estimation are provided: Maximum likelihood estimation and the probability weighted moment method. The functions are: ll{ gpdSim generates data from the GPD, gpdFit fits empirical or simulated data to the distribution, print print method for a fitted GPD object of class ..., plot plot method for a fitted GPD object, summary summary method for a fitted GPD object, gpdqPlot estimation of high quantiles, gpdquantPlot variation of high quantiles with threshold, gpdriskmeasures prescribed quantiles and expected shortfalls, gpdsfallPlot expected shortfall with confidence intervals, gpdshapePlot variation of shape with threshold, gpdtailPlot plot of the tail. }

Usage

gpdSim(model = list(shape = 0.25, location = 0, scale = 1), n = 1000)
gpdFit(x, threshold = NA, nextremes = NA, type = c("mle", "pwm"),
    information = c("observed", "expected"), ...)

## S3 method for class 'gpdFit':
print(x, \dots)
## S3 method for class 'gpdFit':
plot(x, which = "all", \dots)
## S3 method for class 'gpdFit':
summary(object, doplot = TRUE, which = "all", \dots)

gpdqPlot(x, pp = 0.99, ci.type = c("likelihood", "wald"), ci.p = 0.95, 
    like.num = 50)
gpdquantPlot(data, p = 0.99, models = 30, start = 15, end = 500, 
    reverse = TRUE, ci = 0.95, autoscale = TRUE, labels = TRUE, ...)
gpdriskmeasures(x, plevels = c(0.99, 0.995, 0.999, 0.9995, 0.9999))
gpdsfallPlot(x, pp = 0.99, ci.p = 0.95, like.num = 50)
gpdshapePlot(data, models = 30, start = 15, end = 500, reverse = TRUE,
    ci = 0.95, autoscale = TRUE, labels = TRUE, ...) 
gpdtailPlot(fit, optlog = NA, extend = 1.5, labels = TRUE, ...)

Arguments

autoscale
whether or not plot should be automatically scaled; if not, xlim and ylim graphical parameters may be entered.
ci
the probability for asymptotic confidence band; for no confidence band set to zero.
ci.p
the probability for confidence interval (must be less than 0.999).
ci.type
the method for calculating a confidence interval: "likelihood" or "wald".
data
a numeric vector of data.
doplot
a logical. Should the results be plotted?
extend
optional argument for plots 1 and 2 expressing how far x-axis should extend as a multiple of the largest data value. This argument must take values greater than 1 and is useful for showing estimated quantiles beyond data.
fit
[print][plot][summary] - print method, a fitted object of class "gpd".
information
whether standard errors should be calculated with "observed" or "expected" information. This only applies to the maximum likelihood method; for the probability-weighted moments method "expected"
labels
optional argument for plots 1 and 2 specifying whether or not axes should be labelled.
like.num
the number of times to evaluate profile likelihood.
model
[gpdsim] - a list with components shape, location and scale giving the parameters of the GPD distribution. By default the shape parameter has the value 0.25, the location is zero and the
models
the number of consecutive gpd models to be fitted.
n
[gpdsim] - lnumber of generated data points, an integer value.
nextremes
[gpdFit] - the number of upper extremes to be used (either this or threshold must be given but not both).
object
[summary] - a fitted object of class "gpdFit".
optlog
optional argument for plots 1 and 2 giving a particular choice of logarithmic axes: "x" x-axis only; "y" y-axis only; "xy" both axes; "" neither axis.
plevels, p, pp
a vector of probability levels, the desired probability for the quantile estimate (e.g. 0.99 for the 99th percentile).
reverse
should plot be by increasing threshold (TRUE) or number of extremes (FALSE).
start, end
the lowest and maximum number of exceedances to be considered.
threshold
a threshold value (either this or nextremes must be given but not both).
type
a character string selecting the desired estimation mehtod, either "mle" for the maximum likelihood mehtod or "pwm" for the probability weighted moment method. By default, the first will be selected.
which
if which is set to "ask" the function will interactively ask which plot should be displayed. By default this value is set to FALSE and then those plots will be displayed for which the elem
x
[gpdFit] - the data vector. Note, there are two different names for the first argument x and data depending which function name is used, either gpdFit or the EVIS synonyme gpd
...
control parameters and plot parameters optionally passed to the optimization and/or plot function. Parameters for the optimization function are passed to components of the control argument of optim.

Value

  • gpdSim returns a vector of datapoints from the simulated series. gpdFit returns an object of class "gpd" describing the fit including parameter estimates and standard errors. gpdquantPlot returns invisible a table of results. gpdshapePlot returns invisible a table of results. gpdtailPlot returns invisible a list object containing details of the plot is returned invisibly. This object should be used as the first argument of gpdqPlot or gpdsfallPlot to add quantile estimates or expected shortfall estimates to the plot.

Details

Simulation: gpdSim simulates data from a Generalized Pareto distribution. Parameter Estimation: gpdFit fits the model parameters either by the probability weighted moment method or the maxim log likelihood method. The function returns an object of class "gpd" representing the fit of a generalized Pareto model to excesses over a high threshold. The fitting functions use the probability weighted moment method, if method method="pwm" was selected, and the the general purpose optimization function optim when the maximum likelihood estimation, method="mle" or method="ml" is chosen. Methods: print.gpd, plot.gpd and summary.gpd are print, plot, and summary methods for a fitted object of class gpdFit. The plot method provides four different plots for assessing fitted GPD model. gpd* Functions: gpdqPlot calculates quantile estimates and confidence intervals for high quantiles above the threshold in a GPD analysis, and adds a graphical representation to an existing plot. The GPD approximation in the tail is used to estimate quantile. The "wald" method uses the observed Fisher information matrix to calculate confidence interval. The "likelihood" method reparametrizes the likelihood in terms of the unknown quantile and uses profile likelihood arguments to construct a confidence interval. gpdquantPlot creates a plot showing how the estimate of a high quantile in the tail of a dataset based on the GPD approximation varies with threshold or number of extremes. For every model gpdFit is called. Evaluation may be slow. Confidence intervals by the Wald method may be fastest. gpdriskmeasures makes a rapid calculation of point estimates of prescribed quantiles and expected shortfalls using the output of the function gpdFit. This function simply calculates point estimates and (at present) makes no attempt to calculate confidence intervals for the risk measures. If confidence levels are required use gpdqPlot and gpdsfallPlot which interact with graphs of the tail of a loss distribution and are much slower. gpdsfallPlot calculates expected shortfall estimates, in other words tail conditional expectation and confidence intervals for high quantiles above the threshold in a GPD analysis. A graphical representation to an existing plot is added. Expected shortfall is the expected size of the loss, given that a particular quantile of the loss distribution is exceeded. The GPD approximation in the tail is used to estimate expected shortfall. The likelihood is reparametrised in terms of the unknown expected shortfall and profile likelihood arguments are used to construct a confidence interval. gpdshapePlot creates a plot showing how the estimate of shape varies with threshold or number of extremes. For every model gpdFit is called. Evaluation may be slow. gpdtailPlot produces a plot of the tail of the underlying distribution of the data.

References

Hosking J.R.M., Wallis J.R., (1987); Parameter and quantile estimation for the generalized Pareto distribution, Technometrics 29, 339--349.

Examples

Run this code
## SOURCE("fExtremes.53B-GpdFit")

## Load Data:
   data(danish)
   
## gpdSim - 
   # Simulate GPD Data:
   xmpExtremes("Start: Simulate a GPD Distributed Sample > ")
   x = gpdSim(model = list(shape = 0.25, location = 0, scale = 1), n = 1000)
   
## gpdFit -   
   xmpExtremes("Next: Fit Simulated Data to GPD using PWM > ")
   fit = gpdFit(x, nextremes = length(x), type = "pwm") 
   print(fit)
   par(mfcol = c(4, 2), cex = 0.7)
   summary(fit)  
   
## gpdFit -
   xmpExtremes("Next: Fit Simulated Data to GPD using MLE > ")
   fit = gpdFit(x, nextremes = length(x), type = "mle") 
   print(fit)
   summary(fit) 

## gpdFit - 
   xmpExtremes("Next: Fit Danish Fire Data to Excess Losses over 10 > ")
   fit = gpdFit(danish, 10, type = "mle") 
   print(fit)
   par(mfrow = c(2, 2), cex = 0.7)
   summary(fit)  
     
## gpdqPlot - 
   xmpExtremes("Next: 99.5th Percentiles for Danish Fire Data > ")
   fit = gpdFit(danish, threshold = 10, type = "mle")
   par(mfrow = c(1, 1))
   tail = gpdtailPlot(fit)
   gpdqPlot(tail, 0.995)
   title(main = "Danish Data: 99.5th Percentile") 
	
## gpdquantPlot - 	
   xmpExtremes("Next: 99.9th Percentiles for Danish Fire Data > ")
   par(mfrow = c(1, 1))
   gpdquantPlot(danish, p = 0.999)
   title(sub = "Danish Fire: GPD High Quantile") 

## gpdsfallPlot - 
   xmpExtremes("Next: Expected Shortfall for Danish Fire Data > ")
   fit = gpdFit(danish, nextremes = floor(length(danish)/10), type = "mle")
   par(mfrow = c(1, 1))
   tp = gpdtailPlot(fit)
   gpdsfallPlot(tp, 0.999)   
   title(main = "Danish Fire: Expected Shortfall")   
   
## gpdriskmeasures -
   xmpExtremes("Next: Quantiles and Expected Shortfalls > ")
   # Give estimates of 0.999 and 0.9999 quantiles - Danish Fire Date:   
   fit = gpdFit(danish, threshold = 10, type = "mle") 
   par(mfrow = c(1, 1))
   gpdriskmeasures(fit, c(0.99, 0.995, 0.999, 0.9995, 0.9999))   
  
## gpdshapePlot - 
   xmpExtremes("Next: Shape Plot of Heavy-Tailed Simulated Data > ")
   set.seed(4711)
   par(mfrow = c(1, 1))
   gpdshapePlot(gpdSim(n = 1000))
   title(sub = "Simulated GPD", cex.sub = 0.7)   

## gpdshapePlot - 
   xmpExtremes("Next: Shape Plot of Heavy-Tailed Danish Fire Data > ")
   par(mfrow = c(1, 1))
   gpdshapePlot(danish)
   title(sub = "Danish Fire", cex.sub = 0.7)

## gpdtailPlot - 
   xmpExtremes("Next: Plot Tail Estimate of Danish Fire Data >")
   fit = gpdFit(danish, threshold = 10, type = "mle")
   par(mfrow = c(1, 1))
   gpdtailPlot(fit, main = "Danish Fire: GPD Tail Estimate", col = "steelblue4")

Run the code above in your browser using DataLab