spm: Semiparametric maxima estimator of the extremal index

Description

Estimates the extremal index \(\theta\) using a semiparametric block maxima estimator of Northrop (2015) ("N2015") and a variant of this estimator studied by Berghaus and Bucher (2018) ("BB2018"), using both sliding (overlapping) block maxima and disjoint (non-overlapping) block maxima. A simple modification (subtraction of \(1 / b\), where \(b\) is the block size) of the Berghaus and Bucher (2018) estimator ("BB2018b") is also calculated. Estimates of uncertainty are calculated using the asymptotic theory developed by Berghaus and Bucher (2018).

Usage

spm(
  data,
  b,
  bias_adjust = c("BB3", "BB1", "N", "none"),
  constrain = TRUE,
  varN = TRUE,
  which_dj = c("last", "first")
)

Value

A list of class c("spm", "exdex") containing the components listed below. The components that are vectors are labelled to indicate the estimator to which the constituent values relate: "N2015" for Northrop (2015), "BB2018" for Berghaus and Bucher (2018) and "BB2018b" for the modified version.

theta_sl, theta_dj: Vectors containing the estimates of \(\theta\) resulting from sliding maxima and disjoint maxima respectively.
se_sl, se_dj: The estimated standard errors associated with the estimates in theta_sl and theta_dj. The values for "BB2018" and "BB2018b" are identical.
bias_sl, bias_dj: The respective values of the bias-adjustment applied to the raw estimates, that is, the values subtracted from the raw estimates. For estimator BB2018b this includes a contribution for the subtraction of 1 / b. If bias_adjust = "N" or "none" then bias_sl and bias_dj are c(0, 0, 1 / b).
raw_theta_sl, raw_theta_dj: Vectors containing the raw estimates of \(\theta\), prior to any bias_adjustment.
uncon_theta_sl, uncon_theta_dj: The (bias_adjusted) estimates of \(\theta\) before the constraint that they lie in (0, 1] has been applied.
data_sl, data_dj: Matrices containing the \(Y\)-data and \(Z\)-data for the sliding an disjoint maxima respectively. The first columns are the \(Y\)-data, the second columns the \(Z\)-data.
sigma2dj, sigma2dj_for_sl: Estimates of the variance \(\sigma_{{\rm dj}}^2\) defined on pages 2314-2315 of Berghaus and Bucher (2018). The form of the estimates is given on page 2319. sigma2dj is used in estimating the standard error se_dj, sigma2dj_for_sl in estimating the standard error se_sl.
sigma2sl: Estimates of the variance \(\sigma_{{\rm sl}}^2\). defined on pages 2314-2315 of Berghaus and Bucher (2018). The form of the estimates is given on page 2319. sigma2dj_for_sl is used to estimate \(\sigma_{{\rm dj}}^2\) for this purpose.
b: The input value of b.
bias_adjust: The input value of bias_adjust.
call: The call to spm.

Arguments

data: A numeric vector of raw data. No missing values are allowed.
b: A numeric scalar. The block size.
bias_adjust: A character scalar. Is bias-adjustment of the raw estimate of \(\theta\) performed using the bias-reduced estimator (bias_adjust = "BB3"), derived in Section 5 of Berghaus and Bucher (2018); or a simpler version (bias_adjust = "BB1"), in which the raw estimate is multiplied by \((k-1) / k\), where \(k\) is the number of blocks; or the bias-adjustment of the empirical distribution function used to calculate the estimate, as detailed in Section 2 of Northrop (2015). When disjoint maxima are used bias_adjust = "BB1" and bias_adjust = "N" give identical estimates of the Berghaus and Bucher (2018) variant, as explained at the end of Section 5 of Berghaus and Bucher (2018). If bias_adjust = "none" then no bias-adjustment is performed.
constrain: A logical scalar. If constrain = TRUE then any estimates that are greater than 1 are set to 1, that is, they are constrained to lie in (0, 1]. This is carried out after any bias-adjustment. Otherwise, estimates that are greater than 1 may be obtained.
varN: A logical scalar. If varN = TRUE then the estimation of the sampling variance of the Northrop (2015) estimator is tailored to that estimator. Otherwise, the sampling variance derived in Berghaus and Bucher (2018) is used. See Details for further information.
which_dj: A character scalar. Determines which set of disjoint maxima are used to calculate an estimate of \(\theta\): "first", only the set whose first block starts on the first observation in x; "last", only the set whose last block ends on the last observation in x.

Details

The extremal index \(\theta\) is estimated using the semiparametric maxima estimator of Northrop (2015) and variant of this studied by Berghaus and Bucher (2018). In each case a sample of 'data' is derived from the input data data, based on the empirical distribution function of these data evaluated at the maximum values of of blocks of b contiguous values in data.

The estimators are based on an assumption that these 'data' are sampled approximately from an exponential distribution with mean \(1/\theta\). For details see page 2309 of Berghaus and Bucher (2018), where the 'data' for the Northrop (2015) estimator are denoted \(Y\) and those for the Berghaus and Bucher (2018) are denoted \(Z\). For convenience, we will refer to these values as the \(Y\)-data and the \(Z\)-data.

The approximate nature of the model for the \(Y\)-data arises from the estimation of the distribution function \(F\). A further approximation motivates the use of the \(Z\)-data. If \(F\) is known then the variable \(Z / b\) has a beta(1, \(b\theta\)) distribution, so that that is, \(Z\) has mean \(1 / (\theta + 1/b)\). Therefore, an exponential distribution with mean \(1 / (\theta + 1/b)\) may provide a better approximate model, which provides the motivation for subtracting \(1 / b\) from the Berghaus and Bucher (2018) estimator. Indeed, the resulting estimates are typically close to those of the Northrop (2015) estimator.

If sliding = TRUE then the function uses sliding block maxima, that is, the largest value observed in all (length(data) - b + 1) blocks of b observations. If sliding = FALSE then disjoint block maxima, that is, the largest values in (floor(n / b)) disjoint blocks of b observations, are used.

Estimation of the sampling variances of the estimators is based on Proposition 4.1 on page 2319 of Berghaus and Bucher (2018). For the Northrop (2015) variant the user has the choice either to use the sampling variance based on the Berghaus and Bucher (2018) estimator, i.e. the \(Z\)-data (varN = FALSE) or an analogous version tailored to the Northrop (2015) estimator that uses the \(Y\)-data (varN = TRUE).

The estimates of the sampling variances of the sliding blocks estimators are inferred from those of the disjoint blocks estimators (see page 2319 of Berhaus and Bucher (2018)). The calculation of the latter uses a set of disjoint block maxima. If length(data) is not an integer multiple of b then there will be more than one set of these, and all are equally valid. In this event we perform the calculation for all such sets and use the mean of the resulting estimates. This reduces the sampling variability of the estimates at the expense of slowing down the calculation somewhat, particularly if b is small. This may become apparent when calling spm repeatedly in choose_b.

This estimator of the sampling variance of the sliding blocks estimator is not constrained to be positive: a negative estimate may result if the block size is small. In this event no warning will be given until the returned object is printed and, for the affected estimator ("N2015" or "BB2018/BB2018b"),

the corresponding estimated standard errors using sliding blocks will be missing in se_sl in the returned object,
if bias_adjust == "BB3" then bias-adjustment based on bias_adjust == "BB1" will instead be performed when using sliding blocks, because the former relies on the estimated variances of the estimators.

Similarly, bias adjustment under adjust = "BB3" and/or subtraction of \(1 / b\) in the "BB2018b" case may, rare cases, produce a negative estimate of \(\theta\). In these instances an estimate of zero is returned, but the values returned in bias_dj and bias_sl are not changed.

References

Northrop, P. J. (2015) An efficient semiparametric maxima estimator of the extremal index. Extremes 18(4), 585-603. tools:::Rd_expr_doi("10.1007/s10687-015-0221-5")

Berghaus, B., Bucher, A. (2018) Weak convergence of a pseudo maximum likelihood estimator for the extremal index. Ann. Statist. 46(5), 2307-2335. tools:::Rd_expr_doi("10.1214/17-AOS1621")

Examples

Run this code

### Newlyn sea surges
theta <- spm(newlyn, 20)
theta
summary(theta)
coef(theta)
nobs(theta)
vcov(theta)

### S&P 500 index
theta <- spm(sp500, 100)
theta
summary(theta)

Run the code above in your browser using DataLab