N.test: Number of Records Test

Description

Performs tests based on the (weighted) number of records, $N^\omega$. The null hypothesis of the classical record model (i.e., of randomness) is tested against a one-sided alternative: more, or fewer, records than randomness would produce.

Usage

N.test(
  X,
  weights = function(t) 1,
  record = c("upper", "lower"),
  distribution = c("normal", "t", "poisson-binomial"),
  alternative = c("greater", "less"),
  correct = TRUE,
  method = c("mixed", "dft", "butler"),
  simulate.p.value = FALSE,
  B = 1000
)

Arguments

X

A numeric vector, matrix (or data frame).

weights

A function indicating the weight given to the different records according to their position in the series, e.g., if function(t) t-1 then $\omega_t = t-1$.

record

A character string indicating the type of record to be calculated, "upper" or "lower".

distribution

A character string indicating the asymptotic distribution of the statistic, "normal" distribution, Student's "t"-distribution or exact "poisson-binomial" distribution.

alternative

A character string indicating the type of alternative hypothesis, "greater" number of records or "less" number of records.

correct

Logical. Indicates whether a continuity correction should be applied; defaults to TRUE. No correction is applied if simulate.p.value = TRUE or distribution = "poisson-binomial".

method

(If distribution = "poisson-binomial") A character string indicating the method used to compute the cdf of the Poisson binomial distribution, and therefore the p-value. "mixed" is the preferred (and default) method; it is a more efficient combination of the other two algorithms. "dft" uses the discrete Fourier transform algorithm given in Hong (2013). "butler" uses the algorithm given by Butler and Stephens (2016).

simulate.p.value

Logical. Indicates whether to compute p-values by Monte Carlo simulation. No simulation is done if distribution = "poisson-binomial".

B

If simulate.p.value = TRUE, an integer specifying the number of replicates used in the Monte Carlo estimation.

Value

A "htest" object with elements:

statistic

Value of the test statistic.

parameter

(If distribution = "t") Degrees of freedom of the $t$ statistic (equal to $M-1$).

p.value

P-value.

alternative

The alternative hypothesis.

estimate

(If distribution = "normal") A vector with the value of $N_{..}^\omega$, $\mu$ and $\sigma^2$.

method

A character string indicating the type of test performed.

data.name

A character string giving the name of the data.

Details

The null hypothesis is that the data come from a population with independent and identically distributed realizations. The one-sided alternative hypothesis is that the (weighted) number of records is greater (or less) than under the null hypothesis. The (weighted)-number-of-records statistic is calculated according to: $$N_{..}^\omega = \sum_{m=1}^M \sum_{t=1}^T \omega_t I_{tm},$$ where $\omega_t$ are weights given to the different records according to their position in the series and $I_{tm}$ are the record indicators (see I.record).
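The statistic above can be illustrated with a minimal base R sketch. It does not call any RecordTest function; `upper_record_indicators()` and `weighted_records()` are hypothetical helper names introduced here, and ties are assumed not to count as records, which may differ from how I.record treats them.

```r
# Sketch only: base R, hypothetical helper names, strict inequality for records.

# I_tm = 1 if observation t of series m exceeds all earlier values in that series.
upper_record_indicators <- function(X) {
  X <- as.matrix(X)
  I <- apply(X, 2, function(x) {
    prev_max <- c(-Inf, cummax(x)[-length(x)])  # running maximum before time t
    as.integer(x > prev_max)
  })
  matrix(I, nrow = nrow(X), ncol = ncol(X))     # keep T x M shape even if M = 1
}

# Weighted number of records: sum over m and t of w_t * I_tm.
weighted_records <- function(X, weights = function(t) 1) {
  I <- upper_record_indicators(X)
  w <- rep_len(weights(seq_len(nrow(I))), nrow(I))  # function(t) 1 returns length 1
  sum(w * I)                                        # w recycles down each column
}

# Two series (columns) of length T = 4.
X <- cbind(c(1, 3, 2, 5), c(4, 1, 6, 2))
N_unweighted <- weighted_records(X)
N_linear     <- weighted_records(X, weights = function(t) t - 1)
```

In this toy matrix the first series has records at t = 1, 2, 4 and the second at t = 1, 3, so the unweighted count is 5 and the count with weights t - 1 is 6.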

The statistic $N_{..}^\omega$ is exactly Poisson binomial distributed when the $\omega_t$'s only take values in $\{0,1\}$. In any case, it is also approximately normally distributed, with $$Z = \frac{N_{..}^\omega - \mu}{\sigma},$$ where its mean and variance are $$\mu = M \sum_{t=1}^T \omega_t \frac{1}{t},$$ $$\sigma^2 = M \sum_{t=2}^T \omega_t^2 \frac{1}{t} \left(1-\frac{1}{t}\right).$$
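As a numerical check of the statements above, the following sketch builds the Poisson binomial pmf for the unweighted case and compares its mean and variance with $\mu$ and $\sigma^2$. It uses a plain one-variable-at-a-time convolution, which is neither the "dft" nor the "butler" algorithm; `pmf_poisson_binomial()` is a hypothetical helper, and the values of T, M and the observed count are illustrative only.

```r
# Sketch only: direct convolution, not the algorithms used by method = "dft"/"butler".

# pmf of a sum of independent Bernoulli(p_i) variables.
pmf_poisson_binomial <- function(p) {
  pmf <- 1                                   # P(sum = 0) = 1 before adding any variable
  for (p_i in p) {
    pmf <- c(pmf * (1 - p_i), 0) + c(0, pmf * p_i)
  }
  pmf                                        # pmf[k + 1] = P(sum = k)
}

T_len <- 10                                  # series length (illustrative)
M     <- 3                                   # number of series (illustrative)

# Under the null, I_tm are independent Bernoulli(1/t) variables.
p   <- rep(1 / seq_len(T_len), times = M)
pmf <- pmf_poisson_binomial(p)
k   <- seq_along(pmf) - 1

# Closed-form mean and variance with all weights equal to 1.
mu     <- M * sum(1 / seq_len(T_len))
sigma2 <- M * sum((1 / (2:T_len)) * (1 - 1 / (2:T_len)))

mean_matches <- isTRUE(all.equal(sum(k * pmf), mu))
var_matches  <- isTRUE(all.equal(sum((k - mu)^2 * pmf), sigma2))

# Exact upper-tail p-value for a made-up observed count (alternative "greater").
n_obs     <- 12
p_greater <- sum(pmf[k >= n_obs])
```

Both comparisons should agree up to floating-point error, since the mean and variance of a sum of independent Bernoulli variables are $\sum p_i$ and $\sum p_i(1-p_i)$.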

If correct = TRUE, then a continuity correction will be employed: $$Z = \frac{N_{..}^\omega \pm 0.5 - \mu}{\sigma},$$ with "$-$" if the alternative is "greater" and "$+$" if the alternative is "less".
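The normal approximation with and without the correction can be sketched as follows. `n_records_z()` is a hypothetical helper written for this illustration, not part of the package, and the inputs in the usage lines are made up.

```r
# Sketch only: hypothetical helper implementing the formulas for mu, sigma^2 and Z.
n_records_z <- function(N, T_len, M, weights = function(t) 1,
                        alternative = c("greater", "less"), correct = TRUE) {
  alternative <- match.arg(alternative)
  t <- seq_len(T_len)
  w <- rep_len(weights(t), T_len)

  mu     <- M * sum(w / t)
  sigma2 <- M * sum(w^2 * (1 / t) * (1 - 1 / t))  # the t = 1 term is zero

  # Continuity correction: -0.5 for "greater", +0.5 for "less".
  shift <- if (!correct) 0 else if (alternative == "greater") -0.5 else 0.5
  z <- (N + shift - mu) / sqrt(sigma2)

  p <- if (alternative == "greater") pnorm(z, lower.tail = FALSE) else pnorm(z)
  list(z = z, p.value = p, mu = mu, sigma2 = sigma2)
}

# Made-up inputs: 12 records observed in M = 3 series of length T = 10.
with_corr    <- n_records_z(N = 12, T_len = 10, M = 3, correct = TRUE)
without_corr <- n_records_z(N = 12, T_len = 10, M = 3, correct = FALSE)
```

For the "greater" alternative the correction lowers $Z$ by $0.5/\sigma$, so the corrected p-value is the larger of the two.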

When $M>1$, the expression of the variance under the null hypothesis can be substituted by the sample variance in the $M$ series, $\hat{\sigma}^2$. In this case, the statistic $N_{S,..}^\omega$ is asymptotically $t$ distributed, which is a more robust alternative against serial correlation.
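One plausible reading of the $t$ version is a one-sample $t$ test on the per-series weighted record counts, with $M-1$ degrees of freedom. The sketch below follows that reading; it is an assumption about the construction, not a transcription of the package's code, and `n_records_t()` is a hypothetical name. It takes a $T \times M$ matrix of record indicators as input.

```r
# Sketch only: assumes the t version is a one-sample t test on per-series counts.
n_records_t <- function(I, weights = function(t) 1,
                        alternative = c("greater", "less")) {
  alternative <- match.arg(alternative)
  I <- as.matrix(I)
  T_len <- nrow(I)
  M     <- ncol(I)
  stopifnot(M > 1)                       # a sample variance needs several series

  t <- seq_len(T_len)
  w <- rep_len(weights(t), T_len)

  N_m  <- colSums(w * I)                 # weighted number of records per series
  mu_1 <- sum(w / t)                     # null mean for a single series

  stat <- (mean(N_m) - mu_1) / (sd(N_m) / sqrt(M))
  df   <- M - 1
  p <- if (alternative == "greater") pt(stat, df, lower.tail = FALSE) else pt(stat, df)
  list(statistic = stat, parameter = df, p.value = p)
}

# Simulated i.i.d. data: 20 series of length 30 (illustrative sizes).
set.seed(42)
X <- matrix(rnorm(30 * 20), nrow = 30, ncol = 20)
I <- apply(X, 2, function(x) as.integer(x > c(-Inf, cummax(x)[-length(x)])))
res <- n_records_t(I)
```

Because the variance is estimated from the $M$ series rather than taken from the null model, this form does not rely on the within-series independence that the formula for $\sigma^2$ assumes.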

If simulate.p.value = TRUE, the p-value is estimated by Monte Carlo simulations.
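A Monte Carlo p-value of this kind can be sketched in base R for the unweighted case. Under the null the distribution of the number of records does not depend on the (continuous) distribution of the data, so standard normal draws are used here. The helper names and the `(1 + count) / (B + 1)` estimator are assumptions made for this illustration; the package may use a different estimator.

```r
# Sketch only: hypothetical helpers, unweighted upper records, "greater" alternative.
count_upper_records <- function(X) {
  # A value is a record if it exceeds the running maximum of the earlier values.
  sum(apply(X, 2, function(x) x > c(-Inf, cummax(x)[-length(x)])))
}

mc_p_value <- function(N_obs, T_len, M, B = 1000) {
  # Simulate B data sets of M i.i.d. series of length T_len under the null.
  N_sim <- replicate(B, count_upper_records(matrix(rnorm(T_len * M), T_len, M)))
  (1 + sum(N_sim >= N_obs)) / (B + 1)
}

set.seed(1)
# Made-up inputs: 12 records observed in M = 3 series of length T = 10.
p_mc <- mc_p_value(N_obs = 12, T_len = 10, M = 3, B = 200)
```

Larger values of B reduce the Monte Carlo error of the estimate at the cost of run time.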

The size of the tests is adequate for any values of $T$ and $M$. Some comments and a power study are given by Cebrián, Castillo-Mateo and Asín (2021).

References

Butler K, Stephens MA (2016). “The Distribution of a Sum of Independent Binomial Random Variables.” Methodology and Computing in Applied Probability, 19(2), 557-571.

Cebrián A, Castillo-Mateo J, Asín J (2021). “Record Tests to detect non stationarity in the tails with an application to climate change.” Unpublished manuscript.

Hong Y (2013). “On Computing the Distribution Function for the Poisson Binomial Distribution.” Computational Statistics & Data Analysis, 59(1), 41-51.

Examples

# NOT RUN {
# Forward Upper records
N.test(ZaragozaSeries)
# Forward Lower records
N.test(ZaragozaSeries, record = "lower", alternative = "less")
# Backward Upper records
N.test(series_rev(ZaragozaSeries), alternative = "less")
# Backward Lower records
N.test(series_rev(ZaragozaSeries), record = "lower")

# Exact test
N.test(ZaragozaSeries, distribution = "poisson-binomial")
# Exact test for records in the last decade
N.test(ZaragozaSeries, weights = function(t) ifelse(t < 56, 0, 1), distribution = "poisson-binomial")
# Linear weights for a more powerful test (with continuity correction)
N.test(ZaragozaSeries, weights = function(t) t-1, correct = TRUE)

# }
