RecordTest provides data-preparation, exploratory data analysis and inference tools based on the theory of records to describe record occurrence and to detect trends or non-stationarities in time series.
The Classical Record Model:
Record statistics are used primarily to quantify the stochastic behavior
of a process at never-seen-before values, either upper or lower. The setup
of independent and identically distributed (IID) continuous random
variables (RVs), often called the classical record model, is
particularly interesting because the common continuous distribution
underlying the IID RVs will not affect the distribution of the variables
relative to the record occurrence.
Many fields have begun to use the theory of records to study these
remarkable events. Particularly productive is the study of
record-breaking temperatures and their connection with climate change,
but also records in other environmental fields (precipitation, floods,
earthquakes, etc.), economics, biology, physics or even sports have been
analysed.
See Arnold, Balakrishnan and Nagaraja (1998) for an extensive theoretical
introduction to the theory of records and in particular the classical
record model. See Foster and Stuart (1954), Diersen and Trenkler (1996,
2001) and Cebrián, Castillo-Mateo and Asín (2021) for some
distribution-free tests based on the classical record model. For an easy
introduction to RecordTest use vignette("RecordTest").
This package provides tests of the classical record model hypothesis, that is, that the record occurrence in a series of values observed at regular time units comes from an IID series of continuous RVs. For sequences of independent variables with no seasonal component, testing the IID hypothesis is equivalent to testing homogeneity and stationarity.
The functions in the data-preparation step:
The functions admit a vector X corresponding to a single series as
an argument. However, some situations can take advantage of having
\(M\) uncorrelated vectors to infer from the sample. In that case, the
input of the functions can be a matrix X
where each column corresponds to a vector formed by the values of a
series \(X_t\), for \(t=1,\ldots,T\), so that each row of the matrix
corresponds to a time \(t\).
In many real problems, such as those related to environmental phenomena, the series of variables to analyse show a seasonal behavior, and only one realization is available. In order to be able to apply the suggested tools to detect the existence of a trend, the seasonal component has to be removed and a sample of \(M\) uncorrelated series should be obtained. Those problems can be solved by preparing the data adequately. A wide set of tools to carry out a preliminary analysis and to prepare data with a seasonal pattern are implemented in the following functions.
series_record: If only the record times are available.
series_split, series_double: To split the series into several sub-series and remove the seasonal component and autocorrelation.
series_uncor: To extract a subset of uncorrelated series out of the split series.
series_ties, series_untie: To deal with record ties.
series_rev: To study the series backwards.
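The idea of splitting a seasonal series into sub-series can be sketched in base R. This is only an illustration under stated assumptions: the toy data, the period of 365 and all object names are hypothetical, and the code does not call or reproduce any RecordTest function.

```r
# Toy daily series: 4 "years" of 365 values with a seasonal cycle plus a
# small year-to-year increase. All names and values here are illustrative.
period <- 365
years  <- 4
seasonal <- 10 * sin(2 * pi * seq_len(period) / period)
x <- rep(seasonal, times = years) + rep(0.1 * seq_len(years), each = period)

# Reshape so that each row is a year and each column is a calendar day.
X <- matrix(x, ncol = period, byrow = TRUE)
dim(X)

# Within a column the seasonal term is constant, so only the yearly
# increase remains and every column is strictly increasing.
increasing_by_column <- apply(X, 2, function(col) all(diff(col) > 0))
all(increasing_by_column)
```

Each column now compares the same calendar day across years, so the seasonal pattern no longer affects which values are records. Columns for neighbouring days would typically still be correlated in real data, which is why a subset of uncorrelated series is extracted afterwards.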
The functions to compute the record statistics are:
I.record: Computes the observed record indicators.
N.record, Nmean.record: Compute the observed number of records up to time \(t\).
S.record: Computes the observed number of records at every time \(t\), using \(M\) series.
p.record: Computes the estimated record probability at every time \(t\), using \(M\) series.
L.record: Computes the observed record times.
R.record: Computes the observed record values.
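The quantities in this list can be illustrated on a small matrix with base R. This is a minimal sketch, assuming strict upper records and a matrix with one series per column; the helper `upper_record_indicators` and the toy data are hypothetical and are not part of RecordTest.

```r
# Hypothetical helper (not a RecordTest function): upper record indicators
# of a numeric vector. The first value is always a record; later values are
# records when they strictly exceed every earlier value.
upper_record_indicators <- function(x) {
  as.integer(c(TRUE, x[-1] > cummax(x)[-length(x)]))
}

# Toy matrix: rows are times t = 1..6, columns are M = 3 series.
X <- cbind(
  s1 = c(3, 1, 4, 1, 5, 9),
  s2 = c(2, 7, 1, 8, 2, 8),
  s3 = c(5, 4, 3, 2, 1, 0)
)

I <- apply(X, 2, upper_record_indicators)  # record indicators, T x M
N <- apply(I, 2, cumsum)                   # number of records up to time t
S <- rowSums(I)                            # records at time t over the M series
p_hat <- S / ncol(X)                       # estimated record probability at time t

L_s1 <- which(I[, "s1"] == 1)              # record times of the first series
R_s1 <- X[L_s1, "s1"]                      # record values of the first series
```

For the first series the record times are 1, 3, 5 and 6, with record values 3, 4, 5 and 9. The repeated value 8 in the second series is not counted because the comparison is strict.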
The functions to compute the tests:
All the tests performed are distribution-free/nonparametric tests in time series for trend and nonstationarity in the extremes of the distribution based on the null hypothesis that the record indicators are independent and the probabilities of record at time \(t\) are \(p_t = 1 / t\).
foster.test: Implements Foster-Stuart and Diersen-Trenkler tests.
N.test: Implements tests based on the (weighted) number of records.
brown.method: Brown's method to combine dependent p-values from N.test.
fisher.method: General function to apply Fisher's method to independent p-values.
p.regression.test: Implements a regression test based on the record probabilities.
p.chisq.test: Implements a \(\chi^2\)-test based on the record probabilities.
lr.test: Implements likelihood ratio tests based on the record indicators.
score.test: Implements score or Lagrange multiplier tests based on the record indicators.
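As a worked illustration of the null hypothesis behind these tests (independent record indicators with \(p_t = 1/t\)), the moments of the number of records follow from standard properties of independent Bernoulli variables. The notation \(I_i\) for the record indicator at time \(i\) and \(N_t = \sum_{i=1}^{t} I_i\) for the number of records up to time \(t\) is introduced here for the example only, and the derivation is not a description of how any particular function is implemented.

\[
E(N_t) = \sum_{i=1}^{t} \frac{1}{i}, \qquad
\operatorname{Var}(N_t) = \sum_{i=1}^{t} \frac{1}{i}\left(1 - \frac{1}{i}\right).
\]

For example, with \(t = 5\) the expected number of records is \(1 + 1/2 + 1/3 + 1/4 + 1/5 = 137/60 \approx 2.28\), whatever the common continuous distribution of the series. An observed number of upper records well above this value points towards an increasing trend.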
The functions to compute the graphical tools:
records: Shows the series highlighting its records.
L.plot: Shows record times in several series.
foster.plot: Shows plots based on Foster-Stuart and Diersen-Trenkler statistics.
N.plot: Shows the number of records.
p.plot: Shows the record probabilities in different plots.
All the tests and graphical tools can be applied to both upper and lower records in the forward and backward directions.
Other functions:
rcrm: Random generation for the classical record model.
dpoisbinom, ppoisbinom, qpoisbinom, rpoisbinom: Density, distribution function, quantile function and random generation for the Poisson binomial distribution. Related to the probability distribution function of the number of records under the null hypothesis.
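The link between the Poisson binomial distribution and the number of records can be sketched in base R. Under the null hypothesis the number of records up to time \(T\) is a sum of independent Bernoulli indicators with probabilities \(1/t\), so its probability mass function can be built by adding one indicator at a time. The helper `poisbinom_pmf` below is hypothetical; it is a simple convolution written for illustration and is not the package's dpoisbinom.

```r
# Hypothetical helper (not a RecordTest function): probability mass function
# of a sum of independent Bernoulli variables with success probabilities p.
poisbinom_pmf <- function(p) {
  pmf <- 1  # with no indicators the sum is 0 with probability 1
  for (pt in p) {
    # Either the new indicator is 0 (count unchanged) or 1 (count shifts by one).
    pmf <- c(pmf * (1 - pt), 0) + c(0, pmf * pt)
  }
  pmf  # pmf[k + 1] is the probability of exactly k successes
}

# Number of records up to time 5 under the null hypothesis p_t = 1 / t.
T_len <- 5
pmf <- poisbinom_pmf(1 / seq_len(T_len))

sum(pmf)                     # total probability
sum(0:T_len * pmf)           # expected number of records
```

Because the first observation is always a record, the probability of zero records is 0. The probability of exactly one record is \(\prod_{t=2}^{5} (1 - 1/t) = 1/5\).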
Example data sets:
Three example data sets are included with this package. They can be
loaded into R using the data function. Each data set has its own help
file, which can be accessed by help([dataset_name]).
Data included with RecordTest are:
TX_Zaragoza - Daily maximum temperatures at Zaragoza (Spain).
ZaragozaSeries - Split and uncorrelated sub-series of TX_Zaragoza$TX.
Olympic_records_200m - 200-meter Olympic records from 1900 to 2020.
To see how to cite RecordTest in publications or elsewhere,
use citation("RecordTest").
Arnold BC, Balakrishnan N, Nagaraja HN (1998). Records. Wiley Series in Probability and Statistics. Wiley, New York.
Cebrián A, Castillo-Mateo J, Asín J (2021). “Record Tests to detect non stationarity in the tails with an application to climate change.” Unpublished manuscript.
Diersen J, Trenkler G (1996). “Records Tests for Trend in Location.” Statistics, 28(1), 1-12.
Diersen J, Trenkler G (2001). “Weighted Records Tests for Splitted Series of Observations.” In J Kunert, G Trenkler (eds.), Mathematical Statistics with Applications in Biometry: Festschrift in Honour of Prof. Dr. Siegfried Schach, pp. 163-178. Lohmar: Josef Eul Verlag.
Foster FG, Stuart A (1954). “Distribution-Free Tests in Time-Series Based on the Breaking of Records.” Journal of the Royal Statistical Society. Series B (Methodological), 16(1), 1-22.