RecordTest-package: RecordTest: A Package for Testing the Classical Record Model

Description

RecordTest provides data-preparation, exploratory data analysis and inference tools based on the theory of records to describe record occurrence and detect trends or non-stationarities in time series.

Details

The Classical Record Model:

Record statistics are used primarily to quantify the stochastic behavior of a process at never-seen-before values, either upper or lower. The setup of independent and identically distributed (IID) continuous random variables (RVs), often called the classical record model, is particularly interesting because the common continuous distribution underlying the IID RVs does not affect the distribution of the variables related to record occurrence. Many fields have begun to use the theory of records to study these remarkable events. The study of record-breaking temperatures and their connection with climate change has been particularly productive, but records have also been analysed in other environmental fields (precipitation, floods, earthquakes, etc.), economics, biology, physics and even sports. See Arnold, Balakrishnan and Nagaraja (1998) for an extensive theoretical introduction to the theory of records and in particular the classical record model. See Foster and Stuart (1954), Diersen and Trenkler (1996, 2001) and Cebrián, Castillo-Mateo and Asín (2021) for some distribution-free tests based on the classical record model. For an easy introduction to RecordTest use vignette("RecordTest").
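The distribution-free property can be illustrated with a minimal base-R sketch. The helper record_indicator below is hypothetical (it is not a RecordTest function) and only marks strict upper records. Because record occurrence depends only on the ordering of the values, applying a strictly increasing transformation to the series should leave the indicators unchanged.

```r
# Hypothetical helper (not part of RecordTest): strict upper record indicators.
# The first observation is always a record; later ones are records when the
# running maximum increases.
record_indicator <- function(x) {
  c(1L, as.integer(diff(cummax(x)) > 0))
}

x <- c(3, 1, 4, 1, 5, 9, 2, 6)

I_raw <- record_indicator(x)       # 1 0 1 0 1 1 0 0
I_exp <- record_indicator(exp(x))  # same ordering, different distribution

identical(I_raw, I_exp)            # TRUE
```

The same comparison can be repeated with any other strictly increasing function, since only the ranks of the observations matter.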

This package provides tests of the hypothesis of the classical record model, that is, that the records observed in a series of values recorded at regular time units come from an IID series of continuous RVs. For sequences of independent variables with no seasonal component, testing the hypothesis of IID variables is equivalent to testing the hypothesis of homogeneity and stationarity.

The functions in the data-preparation step:

The functions admit a vector X corresponding to a single series as an argument. However, some situations can take advantage of having $M$ uncorrelated vectors to infer from the sample. In that case, the input to the functions can be a matrix X where each column corresponds to a vector formed by the values of a series $X_t$, for $t=1,\ldots,T$, so that each row of the matrix corresponds to a time $t$.

In many real problems, such as those related to environmental phenomena, the series of variables to analyse show a seasonal behavior, and only one realization is available. In order to be able to apply the suggested tools to detect the existence of a trend, the seasonal component has to be removed and a sample of $M$ uncorrelated series should be obtained. Those problems can be solved by preparing the data adequately. A wide set of tools to carry out a preliminary analysis and to prepare data with a seasonal pattern are implemented in the following functions.

series_record: If only the record times are available.

series_split, series_double: To split the series into several sub-series and remove the seasonal component and autocorrelation.

series_uncor: To extract a subset of uncorrelated series from the split series.

series_ties, series_untie: To deal with record ties.

series_rev: To study the series backwards.
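The idea behind splitting a seasonal series can be sketched in base R. This is only an illustration under simplifying assumptions (a toy series with complete periods and a made-up period of 4), not the implementation of series_split, whose arguments and handling of incomplete periods may differ. Each column collects the observations of one season, so each row corresponds to one time unit $t$.

```r
# Toy input (hypothetical data): 3 "years" of 4 "seasons" each,
# stored one year after another.
x <- c(10, 20, 30, 40,
       11, 21, 31, 41,
       12, 22, 32, 42)

n_seasons <- 4  # assumed length of the seasonal period

# Fill the matrix row by row: rows are years (t = 1, ..., T),
# columns are the M = n_seasons sub-series.
X <- matrix(x, ncol = n_seasons, byrow = TRUE)

dim(X)   # 3 4
X[, 1]   # 10 11 12: the first season across the three years
```

Within each column the seasonal component is constant by construction, which is why the sub-series can be analysed for trend separately from the seasonal pattern.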

The functions to compute the record statistics are:

I.record: Computes the observed record indicators.

N.record, Nmean.record: Compute the observed number of records up to time $t$.

S.record: Computes the observed number of records at every time $t$, using $M$ series.

p.record: Computes the estimated record probability at every time $t$, using $M$ series.

L.record: Computes the observed record times.

R.record: Computes the observed record values.
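As a rough guide to what these statistics represent, the following base-R sketch computes them by hand for a small $T \times M$ matrix. The helper record_indicator and the toy matrix are hypothetical, and the code handles only strict upper records; the package functions offer more options and may differ in their output format.

```r
# Hypothetical helper (not part of RecordTest): strict upper record indicators.
record_indicator <- function(x) c(1L, as.integer(diff(cummax(x)) > 0))

# Toy data: T = 4 times (rows), M = 3 series (columns).
X <- cbind(c(2, 5, 3, 7),
           c(4, 1, 6, 5),
           c(1, 2, 3, 4))

I <- apply(X, 2, record_indicator)  # record indicators, one column per series
N <- apply(I, 2, cumsum)            # number of records up to each time t
S <- rowSums(I)                     # number of records at time t over the M series
p_hat <- S / ncol(X)                # estimated record probability at time t

L1 <- which(I[, 1] == 1)            # record times of the first series
R1 <- X[L1, 1]                      # record values of the first series

S    # 3 2 2 2
L1   # 1 2 4
R1   # 2 5 7
```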

The functions to compute the tests:

All the tests are distribution-free (nonparametric) tests for trend and nonstationarity in the extremes of the distribution of a time series. They are based on the null hypothesis that the record indicators are independent and that the probability of a record at time $t$ is $p_t = 1 / t$.
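As a short worked consequence of this null hypothesis (the notation $I_t$ for the record indicator at time $t$ and $N_T$ for the number of records up to time $T$ is introduced here for illustration), the indicators are independent Bernoulli variables with success probability $1/t$, so

$$
N_T = \sum_{t=1}^{T} I_t, \qquad
E(N_T) = \sum_{t=1}^{T} \frac{1}{t}, \qquad
\mathrm{Var}(N_T) = \sum_{t=1}^{T} \frac{1}{t}\left(1 - \frac{1}{t}\right).
$$

For example, with $T = 100$ this gives $E(N_T) \approx 5.19$ and $\mathrm{Var}(N_T) \approx 3.55$, so records are rare under the null hypothesis, and a clearly larger or smaller observed count points towards a trend.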

foster.test: Implements Foster-Stuart and Diersen-Trenkler tests.

N.test: Implements tests based on the (weighted) number of records.

brown.method: Brown's method to combine dependent p-values from N.test.

fisher.method: General function to apply Fisher's method to independent p-values.

p.regression.test: Implements a regression test based on the record probabilities.

p.chisq.test: Implements a $\chi^2$-test based on the record probabilities.

lr.test: Implements likelihood ratio tests based on the record indicators.

score.test: Implements score or Lagrange multiplier tests based on the record indicators.

The functions to compute the graphical tools:

records: Shows the series, highlighting its records.

L.plot: Shows record times in several series.

foster.plot: Shows plots based on Foster-Stuart and Diersen-Trenkler statistics.

N.plot: Shows the number of records.

p.plot: Shows the record probabilities in different plots.

All the tests and graphical tools can be applied to both upper and lower records in the forward and backward directions.
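The four combinations of record type and direction can be reduced to upper forward records by transforming the series, as in this base-R sketch. The helper record_indicator is hypothetical and not part of the package, which handles these cases through its own arguments and through series_rev.

```r
# Hypothetical helper (not part of RecordTest): strict upper record indicators.
record_indicator <- function(x) c(1L, as.integer(diff(cummax(x)) > 0))

x <- c(3, 1, 4, 1, 5, 9, 2, 6)

upper_forward  <- record_indicator(x)        # 1 0 1 0 1 1 0 0
lower_forward  <- record_indicator(-x)       # lower records of x: 1 1 0 0 0 0 0 0
upper_backward <- record_indicator(rev(x))   # series read from the end: 1 0 1 0 0 0 0 0
lower_backward <- record_indicator(-rev(x))  # 1 1 1 0 0 0 0 0
```

Negating the series turns lower records into upper records, and reversing it makes the last observation play the role of the first.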

Other functions:

rcrm: Random generation for the classical record model.

dpoisbinom, ppoisbinom, qpoisbinom, rpoisbinom: Density, distribution function, quantile function and random generation for the Poisson binomial distribution. This distribution is related to the probability distribution of the number of records under the null hypothesis.
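To make the link with the number of records concrete, the sketch below computes the Poisson binomial probability mass function by direct convolution in base R. The helper poisbinom_pmf is hypothetical and is not how dpoisbinom is implemented; it assumes the null hypothesis stated earlier, with independent record indicators and $p_t = 1/t$.

```r
# Hypothetical helper (not part of RecordTest): PMF of a sum of independent
# Bernoulli(p[i]) variables, built up one variable at a time.
poisbinom_pmf <- function(p) {
  pmf <- 1  # before adding any variable, P(sum = 0) = 1
  for (prob in p) {
    # Either the new indicator is 0 (count unchanged) or 1 (count shifts by one).
    pmf <- c(pmf * (1 - prob), 0) + c(0, pmf * prob)
  }
  pmf       # pmf[k + 1] = P(sum = k)
}

T_len <- 4
pmf <- poisbinom_pmf(1 / seq_len(T_len))
names(pmf) <- 0:T_len

pmf * 24             # 0 6 11 6 1: probabilities in units of 1/24
sum(0:T_len * pmf)   # expected number of records: 1 + 1/2 + 1/3 + 1/4
```

In this small case, the probability of exactly one record in four observations is 6/24, which matches the chance that the first observation is the largest of the four.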

Example data sets:

There are three example data sets included with this package. They can be loaded into R using the data function. Each data set has its own help file, which can be accessed by help([dataset_name]). The data included with RecordTest are:

TX_Zaragoza - Daily maximum temperatures at Zaragoza (Spain).

ZaragozaSeries - Split and uncorrelated sub-series of TX_Zaragoza$TX.

Olympic_records_200m - 200-meter Olympic records from 1900 to 2020.

To see how to cite RecordTest in publications or elsewhere, use citation("RecordTest").

References

Arnold BC, Balakrishnan N, Nagaraja HN (1998). Records. Wiley Series in Probability and Statistics. Wiley, New York.

Cebrián A, Castillo-Mateo J, Asín J (2021). “Record Tests to detect non stationarity in the tails with an application to climate change.” Unpublished manuscript.

Diersen J, Trenkler G (1996). “Records Tests for Trend in Location.” Statistics, 28(1), 1–12.

Diersen J, Trenkler G (2001). “Weighted Records Tests for Splitted Series of Observations.” In J Kunert, G Trenkler (eds.), Mathematical Statistics with Applications in Biometry: Festschrift in Honour of Prof. Dr. Siegfried Schach, pp. 163–178. Lohmar: Josef Eul Verlag.

Foster FG, Stuart A (1954). “Distribution-Free Tests in Time-Series Based on the Breaking of Records.” Journal of the Royal Statistical Society. Series B (Methodological), 16(1), 1–22.