
fExtremes (version 4032.84)

ExtremesData: Explorative Data Analysis

Description

A collection and description of functions for explorative data analysis. The tools include plot functions for empirical distributions, quantile plots, graphs exploring the properties of exceedances over a threshold, plots for mean/sum ratio and for the development of records.

The functions are:

emdPlot             Plot of empirical distribution function,
qqparetoPlot        Exponential/Pareto quantile plot,
mePlot              Plot of mean excesses over a threshold,
mrlPlot             another variant, mean residual life plot,
mxfPlot             another variant, with confidence intervals,
msratioPlot         Plot of the ratio of maximum and sum,
recordsPlot         Record development compared with iid data,
ssrecordsPlot       another variant, investigates subsamples,
sllnPlot            verifies Kolmogorov's strong law of large numbers,
lilPlot             verifies Hartman-Wintner's law of the iterated logarithm,
xacfPlot            ACF of exceedances over a threshold,
normMeanExcessFit   fits mean excesses with a normal density,
ghMeanExcessFit     fits mean excesses with a GH density,
hypMeanExcessFit    fits mean excesses with a HYP density,
nigMeanExcessFit    fits mean excesses with a NIG density,
ghtMeanExcessFit    fits mean excesses with a GHT density.

Usage

emdPlot(x, doplot = TRUE, plottype = c("xy", "x", "y", " "), 
    labels = TRUE, ...)

qqparetoPlot(x, xi = 0, trim = NULL, threshold = NULL, doplot = TRUE, labels = TRUE, ...)

mePlot(x, doplot = TRUE, labels = TRUE, ...)

mrlPlot(x, ci = 0.95, umin = mean(x), umax = max(x), nint = 100, doplot = TRUE,
    plottype = c("autoscale", ""), labels = TRUE, ...)

mxfPlot(x, u = quantile(x, 0.05), doplot = TRUE, labels = TRUE, ...)

msratioPlot(x, p = 1:4, doplot = TRUE, labels = TRUE, ...)

recordsPlot(x, ci = 0.95, doplot = TRUE, labels = TRUE, ...)

ssrecordsPlot(x, subsamples = 10, doplot = TRUE, plottype = c("lin", "log"),
    labels = TRUE, ...)

sllnPlot(x, doplot = TRUE, labels = TRUE, ...)

lilPlot(x, doplot = TRUE, labels = TRUE, ...)

xacfPlot(x, u = quantile(x, 0.95), lag.max = 15, doplot = TRUE,
    which = c("all", 1, 2, 3, 4), labels = TRUE, ...)

normMeanExcessFit(x, doplot = TRUE, trace = TRUE, ...)

ghMeanExcessFit(x, doplot = TRUE, trace = TRUE, ...)

hypMeanExcessFit(x, doplot = TRUE, trace = TRUE, ...)

nigMeanExcessFit(x, doplot = TRUE, trace = TRUE, ...)

ghtMeanExcessFit(x, doplot = TRUE, trace = TRUE, ...)

Value

The functions return a plot.

Arguments

ci

[mrlPlot][recordsPlot] -
a confidence level. By default 0.95, i.e. 95%.

doplot

a logical value. Should the results be plotted? By default TRUE.

labels

a logical value. Whether or not x- and y-axes should be automatically labelled and a default main title should be added to the plot. By default TRUE.

lag.max

[xacfPlot] -
maximum number of lags at which to calculate the autocorrelation functions. The default value is 15.

nint

[mrlPlot] -
the number of intervals, see umin and umax. The default value is 100.

p

[msratioPlot] -
the power exponents, a numeric vector. By default a sequence from 1 to 4 in unit integer steps.

plottype

[emdPlot] -
which axes should be on a log scale: "x" x-axis only; "y" y-axis only; "xy" both axes; "" neither axis.
[mrlPlot] -
a character string. If set to "autoscale", the scale of the plot is determined automatically; any other string allows user-specified scale information to be passed through the ... argument.
[ssrecordsPlot] -
one of two options, either "lin" or "log". The default creates a linear plot.

subsamples

[ssrecordsPlot] -
the number of subsamples, by default 10, an integer value.

threshold, trim

[qqparetoPlot] -
numeric values. threshold is the value at which the data are to be left-truncated; trim is the value at which the data are to be right-truncated. Both default to NULL, as listed under Usage.

trace

a logical flag, by default TRUE. Should the calculations be traced?

u

a numeric value, the threshold level at which the data are to be truncated. For xacfPlot the default is the 95% quantile, u=quantile(x,0.95); the Usage section lists u=quantile(x,0.05) as the default for mxfPlot.

umin, umax

[mrlPlot] -
range of threshold values. If umin and/or umax are not available, then by default they are set to the following values: umin=mean(x) and umax=max(x).

which

[xacfPlot] -
a numeric or character value, if which="all" then all four plots are displayed, if which is an integer between one and four, then the first, second, third or fourth plot will be displayed.

x, y

numeric data vectors or in the case of x an object to be plotted.

xi

the shape parameter of the generalized Pareto distribution.

...

additional arguments passed to the FUN or plot function.

Author

Some of the functions were implemented from Alec Stephenson's R package evir, which was ported from Alexander McNeil's S library EVIS (Extreme Values in S). Some come from Alec Stephenson's R package ismev, which is based on Stuart Coles' code from his book, Introduction to Statistical Modeling of Extreme Values. The remaining functions were written by Diethelm Wuertz.

Details

Empirical Distribution Function:

The function emdPlot is a simple exploratory function. A straight line on the double log scale indicates Pareto tail behaviour.
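The straight-line criterion can be checked numerically. The following is a minimal base R sketch, not the emdPlot implementation: it simulates Pareto data with an assumed tail index alpha = 2, computes empirical survival probabilities 1 - F(x), and fits a line on the double log scale. The variable names and the simulated data are illustrative assumptions.

```r
set.seed(1)
n <- 5000
alpha <- 2                       # assumed tail index of the simulated data
x <- runif(n)^(-1 / alpha)       # Pareto(alpha) sample via inverse transform, x >= 1

xs <- sort(x)
surv <- 1 - ppoints(n)           # empirical survival probabilities, strictly in (0, 1)

# For a Pareto tail, log(1 - F(x)) is linear in log(x) with slope -alpha
fit <- lm(log(surv) ~ log(xs))
slope <- unname(coef(fit)[2])
print(slope)

# Uncomment to view the double log plot:
# plot(xs, surv, log = "xy", xlab = "x", ylab = "1 - F(x)")
```

A fitted slope close to -alpha is what the straight line on the double log scale corresponds to.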

Quantile-Quantile Pareto Plot:

qqparetoPlot creates a quantile-quantile plot for threshold data. If xi is zero the reference distribution is the exponential; if xi is non-zero the reference distribution is the generalized Pareto with that parameter value expressed by xi. In the case of the exponential, the plot is interpreted as follows: Concave departures from a straight line are a sign of heavy-tailed behaviour, convex departures show thin-tailed behaviour.
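The reference quantiles behind such a plot can be sketched in base R. This is an illustration under stated assumptions, not the package's code: gpd_quantile is a hypothetical helper implementing the standard generalized Pareto quantile function with unit scale, and the two samples are simulated. Linearity of the quantile-quantile points is summarised here by a correlation.

```r
# Hypothetical helper: GPD quantile function with unit scale.
# xi = 0 reduces to the exponential quantile function.
gpd_quantile <- function(p, xi) {
  if (xi == 0) -log(1 - p) else ((1 - p)^(-xi) - 1) / xi
}

set.seed(2)
n <- 2000
p <- ppoints(n)
theo <- gpd_quantile(p, xi = 0)          # exponential reference quantiles

x_exp <- sort(rexp(n))                   # exponential tail
x_par <- sort(runif(n)^(-1 / 1.5) - 1)   # heavy tail, assumed Pareto index 1.5

# Correlation of the QQ points: close to 1 when the reference fits
r_exp <- cor(x_exp, theo)
r_par <- cor(x_par, theo)
print(c(exponential = r_exp, pareto = r_par))

# Uncomment to view the plot (data on the x-axis, reference on the y-axis):
# plot(x_par, theo, xlab = "ordered data", ylab = "exponential quantiles")
```

The heavy-tailed sample departs from the straight line against the exponential reference, which is the situation the interpretation above describes.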

Mean Excess Function Plot:

Three variants to plot the mean excess function are available: a sample mean excess plot over increasing thresholds, and two mean excess function plots with confidence intervals for discrimination in the tails of a distribution. In general, an upward trend in a mean excess function plot shows heavy-tailed behaviour. In particular, a straight line with positive gradient above some threshold is a sign of Pareto behaviour in the tail. A downward trend shows thin-tailed behaviour, whereas a line with zero gradient shows an exponential tail.

Some hints: Because the upper plotting points are each the average of only a handful of extreme excesses, they may be omitted for a prettier plot. For mrlPlot and mxfPlot the upper tail is investigated; for the lower tail, reverse the sign of the data vector.
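The sample mean excess function itself is easy to compute by hand. The sketch below uses a hypothetical helper, mean_excess, and simulated data; it is not the code used by mePlot, mrlPlot or mxfPlot. It compares an exponential sample, whose mean excess function is flat, with a Pareto sample (assumed index 3), whose mean excess function increases with the threshold.

```r
# Hypothetical helper: sample mean excess e(u) = mean(x - u | x > u)
mean_excess <- function(x, u) {
  vapply(u, function(t) mean(x[x > t] - t), numeric(1), USE.NAMES = FALSE)
}

set.seed(3)
n <- 10000
x_exp <- rexp(n)                 # exponential: e(u) is constant (equal to 1 here)
x_par <- runif(n)^(-1 / 3)       # Pareto, assumed index 3: e(u) = u / 2, increasing

probs <- c(0.5, 0.7, 0.9)
u_exp <- unname(quantile(x_exp, probs))
u_par <- unname(quantile(x_par, probs))

me_exp <- mean_excess(x_exp, u_exp)
me_par <- mean_excess(x_par, u_par)
print(rbind(threshold = u_exp, mean_excess = me_exp))
print(rbind(threshold = u_par, mean_excess = me_par))
```

For the lower tail, the same helper can be applied to -x, in line with the hint above.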

Plot of the Maximum/Sum Ratio:

The ratio of maximum and sum is a simple tool for detecting heavy tails of a distribution and for giving a rough estimate of the order of its finite moments. Sharp increases in the curves of an msratioPlot are a sign of heavy-tailed behaviour.
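As an illustration, the running ratio max(|x|^p) / sum(|x|^p) can be computed in a few lines of base R. The helper ms_ratio is hypothetical and the data are simulated, so this is a sketch of the idea rather than the msratioPlot implementation. The ratio tends to zero when the p-th moment is finite and stays away from zero otherwise.

```r
# Hypothetical helper: running ratio of maximum to sum of |x|^p
ms_ratio <- function(x, p) {
  y <- abs(x)^p
  cummax(y) / cumsum(y)
}

set.seed(4)
n <- 20000
x_exp <- rexp(n)                 # all moments finite
x_par <- runif(n)^(-1 / 1.5)     # Pareto, assumed index 1.5: second moment infinite

r_exp2 <- ms_ratio(x_exp, p = 2)
r_par2 <- ms_ratio(x_par, p = 2)
print(c(exponential = r_exp2[n], pareto = r_par2[n]))

# Uncomment to view the paths; jumps in the Pareto path mark new extreme values:
# plot(r_par2, type = "l", ylim = c(0, 1), xlab = "n", ylab = "max / sum, p = 2")
# lines(r_exp2, lty = 2)
```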

Plot of the Development of Records:

These are functions that investigate the development of records in a dataset and calculate the expected behaviour for iid data. recordsPlot counts records and reports the observations at which they occur. In addition subsamples can be investigated with the help of the function ssrecordsPlot.
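Counting records and comparing them with the iid benchmark can be sketched as follows. The helpers count_records and expected_records are hypothetical names, not functions of the package. For iid data from a continuous distribution, the expected number of records among n observations is the harmonic sum 1 + 1/2 + ... + 1/n.

```r
# Hypothetical helper: an observation is a record if it exceeds all earlier ones
count_records <- function(x) {
  is_record <- c(TRUE, x[-1] > cummax(x)[-length(x)])
  list(count = sum(is_record), at = which(is_record))
}

# Expected number of records in n iid continuous observations
expected_records <- function(n) sum(1 / seq_len(n))

x <- c(3, 1, 4, 1, 5, 9, 2, 6)
res <- count_records(x)
print(res$count)                      # number of records
print(res$at)                         # positions at which records occur
print(expected_records(length(x)))    # iid benchmark for the same sample size
```

In this small example the records occur at positions 1, 3, 5 and 6. Many more records than the benchmark would point to a trend or dependence in the data.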

Plot of Kolmogorov's and Hartman-Wintner's Laws:

The function sllnPlot verifies Kolmogorov's strong law of large numbers, and the function lilPlot verifies Hartman-Wintner's law of the iterated logarithm.
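Both laws can be illustrated with simulated data. The sketch below assumes iid standard normal observations (mean 0, variance 1) and is not the code behind sllnPlot or lilPlot. The running mean should settle near the true mean, and the normalised partial sums S_k / sqrt(2 k log log k) have limit superior 1 and limit inferior -1 almost surely.

```r
set.seed(6)
n <- 100000
x <- rnorm(n)                    # iid, mean 0, variance 1 (assumed for the sketch)

k <- seq_len(n)
s <- cumsum(x)

# Strong law of large numbers: running mean S_k / k
running_mean <- s / k

# Law of the iterated logarithm: log(log(k)) is positive only for k >= 3,
# so the first two partial sums are dropped
kk <- k[k >= 3]
lil_path <- s[kk] / sqrt(2 * kk * log(log(kk)))

print(running_mean[n])
print(range(lil_path))

# Uncomment to view the paths:
# plot(running_mean, type = "l", xlab = "n", ylab = "running mean")
# plot(kk, lil_path, type = "l", xlab = "n", ylab = "normalised partial sum")
# abline(h = c(-1, 1), lty = 2)
```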

ACF Plot of Exceedances over a Threshold:

This function plots the autocorrelation functions of heights and distances of exceedances over a threshold.
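The two series involved, heights and distances of exceedances, can be extracted with base R. This is a sketch on simulated data, using acf from the stats package, and is not the xacfPlot implementation. Heights are taken here as the excess amounts over the threshold, and distances as the gaps between consecutive exceedance times; both conventions are assumptions of the sketch.

```r
set.seed(7)
n <- 5000
x <- rt(n, df = 4)                       # simulated heavy-tailed series
u <- unname(quantile(x, 0.95))           # threshold at the 95% quantile

idx <- which(x > u)                      # times of the exceedances
heights <- x[idx] - u                    # excess amounts over the threshold
distances <- diff(idx)                   # gaps between consecutive exceedances

# Autocorrelations up to lag 15, computed without plotting
acf_heights <- acf(heights, lag.max = 15, plot = FALSE)
acf_distances <- acf(distances, lag.max = 15, plot = FALSE)

print(length(idx))
print(round(acf_heights$acf[2:4], 3))
print(round(acf_distances$acf[2:4], 3))
```

For iid data, neither series should show significant autocorrelation; clustering of extremes would show up in the distances.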

References

Coles, S. (2001); An Introduction to Statistical Modeling of Extreme Values, Springer.

Embrechts, P., Klueppelberg, C., Mikosch, T. (1997); Modelling Extremal Events, Springer.

Examples

## Load fExtremes (assumed to provide danishClaims) and timeSeries:
   library(fExtremes)
   library(timeSeries)

## Danish fire insurance data:
   data(danishClaims)
   danishClaims <- as.timeSeries(danishClaims)

## emdPlot -
   # Show Pareto tail behaviour:
   par(mfrow = c(2, 2), cex = 0.7)
   emdPlot(danishClaims)

## qqparetoPlot -
   # QQ-Plot of heavy-tailed Danish fire insurance data:
   qqparetoPlot(danishClaims, xi = 0.7)

## mePlot -
   # Sample mean excess plot of heavy-tailed Danish fire insurance data:
   mePlot(danishClaims)

## ssrecordsPlot -
   # Record fire insurance losses in Denmark:
   ssrecordsPlot(danishClaims, subsamples = 10)
