TH: Sequential Goodness of Fit Testing for the Generalized Pareto Distribution

Description

An implementation of the sequential goodness-of-fit testing procedure proposed in Thompson et al. (2009) for automated threshold selection.

Usage

TH(data, thresholds)

Arguments

data

vector of sample data

thresholds

a sequence of pre-defined candidate thresholds at which the GPD assumption is checked

Value

threshold

the threshold used for the test

num.above

the number of observations above the given threshold

p.values

raw p-values for the thresholds tested

ForwardStop

transformed p-values according to the ForwardStop criterion. See G'Sell et al. (2016) for more information

StrongStop

transformed p-values according to the StrongStop criterion. See G'Sell et al. (2016) for more information

est.scale

estimated scale parameter for the given threshold

est.shape

estimated shape parameter for the given threshold
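The two transformed columns can be illustrated with a minimal sketch of the ForwardStop and StrongStop quantities as defined in G'Sell et al. (2016). The p-values and the names `forward_stop` and `strong_stop` are made up for illustration, and it is an assumption that TH computes its output columns in exactly this way.

```r
# Hypothetical raw p-values for m = 5 ordered hypotheses (illustration only)
p <- c(0.01, 0.02, 0.30, 0.60, 0.80)
m <- length(p)
k <- seq_len(m)

# ForwardStop: running mean of -log(1 - p_i) over the first k hypotheses
forward_stop <- cumsum(-log(1 - p)) / k

# StrongStop: exp(sum_{j = k}^{m} log(p_j) / j), rescaled by m / k
strong_stop <- rev(exp(cumsum(rev(log(p) / k)))) * m / k

data.frame(p.values = p, ForwardStop = forward_stop, StrongStop = strong_stop)
```

Under either rule, the stopping index for a level alpha is the largest k whose transformed value does not exceed alpha.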

Details

The procedure proposed in Thompson et al. (2009) is based on sequential goodness-of-fit testing. First, one chooses an equally spaced grid of possible thresholds. The authors recommend 100 thresholds between the 50 percent and 98 percent quantiles of the data, provided enough observations remain (about 100 observations above the last pre-defined threshold). Then the parameters of a generalized Pareto distribution (GPD) are estimated for each threshold. One can show that the differences between the scale parameters of subsequent thresholds are approximately normally distributed. A Pearson chi-squared test for normality is therefore applied to all the differences, successively removing the smallest threshold until the test no longer rejects.
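The recommended threshold grid can be built with base R as in the sketch below. The simulated sample, its size, and the names `x`, `q`, `u` and `n_above_last` are assumptions made for illustration; the sample size is chosen so that roughly 200 observations lie above the largest threshold.

```r
set.seed(1)
x <- rexp(10000)  # simulated data, for illustration only

# 100 equally spaced thresholds between the 50% and 98% quantiles
q <- unname(quantile(x, c(0.50, 0.98)))
u <- seq(q[1], q[2], length.out = 100)

# check that about 100 or more observations exceed the last threshold
n_above_last <- sum(x > u[length(u)])
n_above_last >= 100
```

If the check fails for a given data set, the upper quantile of the grid would need to be lowered before calling TH(data, u).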

References

Thompson, P., Cai, Y. and Reeve, D. (2009). Automated threshold selection methods for extreme wave analysis. Coastal Engineering, 56(10), 1013-1021.

G'Sell, M.G., Wager, S., Chouldechova, A. and Tibshirani, R. (2016). Sequential selection procedures and false discovery rate control. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(2), 423-444.

Examples


# NOT RUN {
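# simulate 1000 standard exponential observations
data <- rexp(1000)
# 100 equally spaced candidate thresholds between the 10% and 90% quantiles
u <- seq(quantile(data, 0.1), quantile(data, 0.9), length.out = 100)
A <- TH(data, u)
A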
# }
