An implementation of the procedure proposed in Danielsson et al. (2016) for selecting the optimal threshold in extreme value analysis.
mindist(data, ts = 0.15, method = "mad")
data: vector of sample data
ts: size of the upper tail the procedure is applied to; default is 15 percent of the data
method: one of "ks" for the Kolmogorov-Smirnov distance metric or "mad" for the mean absolute deviation (default)
optimal number of upper order statistics, i.e. the number of exceedances or data points in the tail
the corresponding threshold
the corresponding tail index, obtained by plugging k0 into the Hill estimator
The procedure proposed in Danielsson et al. (2016) minimizes the distance between the largest upper order statistics of the dataset, i.e. the empirical tail, and the theoretical tail of a Pareto distribution. The parameters of this distribution are estimated using Hill's estimator, which itself requires a number of upper order statistics k. The distance is then minimized with respect to this k. The optimal number, denoted k0 here, is equivalent to the number of extreme values or, if you wish, the number of exceedances in the context of a POT model such as the generalized Pareto distribution. k0 can then be associated with the unknown threshold u of the GPD by taking u to be the (n-k0)-th upper order statistic. As distance metric one can choose either the mean absolute deviation, called "mad" here, or the maximum absolute deviation, also known as the Kolmogorov-Smirnov distance metric ("ks"). For more information see the references.
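As a rough illustration of the idea above, the minimization can be sketched in plain R. This is a sketch, not the package's internal implementation: the function name mindist_sketch is hypothetical, the deviations are computed between the upper order statistics and the fitted Pareto quantiles on the original scale, and the search is restricted to the upper ts fraction of the data; the paper's exact criterion may differ in these details.

```r
# Sketch of quantile-driven threshold selection (NOT the package's code).
mindist_sketch <- function(x, ts = 0.15, method = c("mad", "ks")) {
  method <- match.arg(method)
  x <- sort(x, decreasing = TRUE)       # descending order statistics
  n <- length(x)
  kmax <- floor(ts * n)                 # restrict the search to the upper tail
  dist <- rep(NA_real_, kmax)
  for (k in 2:kmax) {
    # Hill estimate of the extreme value index gamma from k exceedances
    gamma_hat <- mean(log(x[1:k])) - log(x[k + 1])
    # Pareto quantiles implied by the fit above the (k+1)-th order statistic
    j <- 1:k
    q_pareto <- x[k + 1] * (k / j)^gamma_hat
    dev <- abs(x[j] - q_pareto)
    dist[k] <- if (method == "mad") mean(dev) else max(dev)
  }
  k0 <- which.min(dist)                 # NA entries are ignored
  gamma0 <- mean(log(x[1:k0])) - log(x[k0 + 1])
  list(k0 = k0, threshold = x[k0 + 1], tail.index = 1 / gamma0)
}

set.seed(42)
x <- (1 - runif(5000))^(-1 / 2)         # Pareto sample with tail index 2
res <- mindist_sketch(x, method = "mad")
```

The returned threshold is the (k0+1)-th largest observation, matching the interpretation of u as an upper order statistic described above.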
Danielsson, J., Ergun, L.M., de Haan, L. and de Vries, C.G. (2016). Tail Index Estimation: Quantile Driven Threshold Selection.
# NOT RUN {
data(danish)
mindist(danish, method = "mad")
# }