Calculates sufficient statistics for the \(K\)-gaps model for the extremal index \(\theta\). Called by kgaps.
kgaps_stat(data, u, q_u, k = 1, inc_cens = TRUE)
A list containing the sufficient statistics, with components
N0
the number of zero \(K\)-gaps.
N1
contribution from non-zero \(K\)-gaps (see Details).
sum_qs
the sum of the (scaled) \(K\)-gaps, that is, \(q (S_0 + \cdots + S_N)\), where \(q\) is estimated by the proportion of threshold exceedances.
n_kgaps
the number of \(K\)-gaps that contribute to the log-likelihood.
data
A numeric vector of raw data.
u
A numeric scalar. Extreme value threshold applied to data.
q_u
A numeric scalar. An estimate of the probability with which the threshold u is exceeded. If q_u is missing then it is calculated using mean(data > u, na.rm = TRUE).
k
A numeric scalar. Run parameter \(K\), as defined in Suveges and Davison (2010). Threshold inter-exceedance times that are not larger than k units are assigned to the same cluster, resulting in a \(K\)-gap equal to zero. Specifically, the \(K\)-gap \(S\) corresponding to an inter-exceedance time of \(T\) is given by \(S = \max(T - K, 0)\).
inc_cens
A logical scalar indicating whether or not to include contributions from right-censored inter-exceedance times relating to the first and last observations. It is known that these times are greater than or equal to the time observed. See Attalides (2015) for details.
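The definition \(S = \max(T - K, 0)\) given for the run parameter can be illustrated with a minimal sketch. The helper name k_gaps_sketch and the toy data are hypothetical, the sketch covers only the uncensored \(K\)-gaps between successive exceedances, and it is not necessarily how kgaps_stat computes them internally.

```r
# Hypothetical helper (not part of the package): uncensored K-gaps only
k_gaps_sketch <- function(data, u, k = 1) {
  exc_times <- which(data > u)   # positions of threshold exceedances (NAs dropped)
  T_gaps <- diff(exc_times)      # inter-exceedance times T
  pmax(T_gaps - k, 0)            # K-gap: S = max(T - K, 0)
}

# Toy data: exceedances of u = 4 occur at positions 2, 3, 6 and 10
x <- c(1, 5, 6, 1, 1, 7, 1, 1, 1, 8)
k_gaps_sketch(x, u = 4, k = 1)
# inter-exceedance times are 1, 3, 4, so the K-gaps are 0, 2, 3
```

Increasing k merges more exceedances into the same cluster: in this toy example k = 3 would turn the first two \(K\)-gaps into zeros.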
The sample \(K\)-gaps are \(S_0, S_1, \ldots, S_{N-1}, S_N\), where \(S_1, \ldots, S_{N-1}\) are uncensored and \(S_0\) and \(S_N\) are right-censored. Under the assumption that the \(K\)-gaps are independent, the log-likelihood of the \(K\)-gaps model is given by
$$l(\theta; S_0, \ldots, S_N) = N_0 \log(1 - \theta) + 2 N_1 \log \theta - \theta q (S_0 + \cdots + S_N),$$
where
\(q\) is the threshold exceedance probability, estimated by the proportion of threshold exceedances,
\(N_0\) is the number of uncensored sample \(K\)-gaps that are equal to zero,
\(N_1\) is the number of positive sample \(K\)-gaps, apart from an adjustment for the contributions of \(S_0\) and \(S_N\). Specifically, if inc_cens = TRUE then \(N_1\) is equal to the number of \(S_1, \ldots, S_{N-1}\) that are positive plus \((I_0 + I_N) / 2\), where \(I_0 = 1\) if \(S_0\) is greater than zero and \(I_0 = 0\) otherwise, and similarly for \(I_N\).
The differing treatment of uncensored and right-censored \(K\)-gaps reflects differing contributions to the likelihood. Right-censored \(K\)-gaps that are equal to zero add no information to the likelihood. For full details see Suveges and Davison (2010) and Attalides (2015).
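As a rough illustration of how the statistics enter the log-likelihood, the sketch below handles the simpler case in which the censored \(K\)-gaps \(S_0\) and \(S_N\) are ignored. That this corresponds to inc_cens = FALSE is an assumption made here. The inputs S and q and the function name kgaps_loglik_sketch are hypothetical and are not taken from the package.

```r
# Hypothetical inputs: uncensored K-gaps S_1, ..., S_{N-1} and an estimate of q
S <- c(0, 2, 3)
q <- 0.4

# Sufficient statistics, assuming the censored K-gaps are left out
N0 <- sum(S == 0)      # number of uncensored K-gaps equal to zero
N1 <- sum(S > 0)       # number of positive K-gaps (no censoring adjustment)
sum_qs <- q * sum(S)   # sum of the scaled K-gaps

# Log-likelihood expressed through the sufficient statistics only
kgaps_loglik_sketch <- function(theta, N0, N1, sum_qs) {
  N0 * log(1 - theta) + 2 * N1 * log(theta) - theta * sum_qs
}

kgaps_loglik_sketch(0.5, N0, N1, sum_qs)
```

Because the data enter only through N0, N1 and sum_qs, the log-likelihood can be evaluated for any \(\theta\) without revisiting the raw series.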
If \(N_1 = 0\) then we are in the degenerate case where there is one cluster (all \(K\)-gaps are zero) and the likelihood is maximized at \(\theta = 0\).
If \(N_0 = 0\) then all exceedances occur singly (all \(K\)-gaps are positive) and the likelihood is maximized at \(\theta = 1\).
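Away from these boundary cases, setting the derivative of the log-likelihood to zero gives the quadratic \(Q \theta^2 - (N_0 + 2 N_1 + Q) \theta + 2 N_1 = 0\), where \(Q = q (S_0 + \cdots + S_N)\), whose smaller root lies in \((0, 1)\). The sketch below follows from the log-likelihood displayed above. The function name kgaps_mle_sketch is hypothetical, the inputs are assumed to satisfy \(N_0 > 0\), \(N_1 > 0\) and \(Q > 0\), and it is not claimed to be how kgaps computes its estimate.

```r
# Hypothetical helper: maximiser of
#   N0 * log(1 - theta) + 2 * N1 * log(theta) - theta * sum_qs
# Assumes N0 > 0, N1 > 0 and sum_qs > 0 (the non-degenerate case)
kgaps_mle_sketch <- function(N0, N1, sum_qs) {
  b <- N0 + 2 * N1 + sum_qs
  # Smaller root of sum_qs * theta^2 - b * theta + 2 * N1 = 0
  (b - sqrt(b^2 - 8 * N1 * sum_qs)) / (2 * sum_qs)
}

theta_hat <- kgaps_mle_sketch(N0 = 1, N1 = 2, sum_qs = 2)

# Numerical check against a direct one-dimensional maximisation
loglik <- function(theta) 1 * log(1 - theta) + 2 * 2 * log(theta) - theta * 2
theta_num <- optimize(loglik, interval = c(1e-6, 1 - 1e-6), maximum = TRUE)$maximum
c(closed_form = theta_hat, numerical = theta_num)
```

The two values should agree up to the tolerance of optimize, which gives a quick sanity check on the closed form.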
Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, Annals of Applied Statistics, 4(1), 203-221. doi:10.1214/09-AOAS292
Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. https://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf
kgaps for maximum likelihood estimation of the extremal index \(\theta\) using the \(K\)-gaps model.
u <- quantile(newlyn, probs = 0.90)
kgaps_stat(newlyn, u)