Performs the Foster-Stuart, Diersen-Trenkler and Cebrián-Castillo-Asín records tests for trend in location, variation or the tails. The null hypothesis of the classical record model (i.e., of IID continuous RVs) is tested against the one-sided alternative selected with the argument alternative.
foster.test(
X,
weights = function(t) 1,
statistic = c("D", "d", "S", "s", "U", "L", "W"),
distribution = c("normal", "t"),
alternative = c("greater", "less"),
correct = FALSE,
permutation.test = FALSE,
simulate.p.value = FALSE,
B = 1000
)
A "htest" object with elements:
statistic: Value of the test statistic.
parameter: (If distribution = "t") Degrees of freedom of the \(t\) statistic (equal to \(M-1\)).
p.value: P-value.
alternative: The alternative hypothesis.
estimate: (If distribution = "normal") A vector with the value of the statistic, \(\mu\) and \(\sigma^2\). \(\sigma^2\) is NA when all of the following hold: statistic is one of "D", "S" or "W" (with the exception of "D" without weights); the p-value is computed with permutations or Monte Carlo simulations; and \(T > 500\).
method: A character string indicating the type of test performed.
data.name: A character string giving the name of the data.
X: A numeric vector, matrix (or data frame).
weights: A function indicating the weight given to the different records according to their position in the series, e.g., if function(t) t - 1 then \(\omega_t = t - 1\).
statistic: A character string indicating the type of statistic to be calculated, i.e., one of "D", "d", "S", "s", "U", "L" or "W" (see Details).
distribution: A character string indicating the asymptotic distribution of the statistic, "normal" or Student's "t" distribution.
alternative: A character string indicating the type of alternative hypothesis: "greater" number of records or "less" number of records.
correct: Logical. Indicates whether a continuity correction should be applied; defaults to FALSE. No correction is applied if simulate.p.value = TRUE.
permutation.test: Logical. Indicates whether to compute p-values by permutation simulation (Castillo-Mateo et al. 2023). It does not require that the columns of X be independent. If both this argument and simulate.p.value are TRUE, permutations take precedence.
simulate.p.value: Logical. Indicates whether to compute p-values by Monte Carlo simulation. If permutation.test = TRUE, permutations take precedence.
B: If permutation.test = TRUE or simulate.p.value = TRUE, an integer specifying the number of replicates used in the permutation or Monte Carlo estimation.
Jorge Castillo-Mateo
In this function, the tests are implemented as given by Foster and Stuart (1954) and Diersen and Trenkler (1996, 2001), with some modifications to the standardisation of these statistics given by Cebrián, Castillo-Mateo and Asín (2022). The null hypothesis is that the data are independent and identically distributed realisations from a single population. The one-sided alternative hypothesis is that the chosen statistic is greater (or less) than under the null hypothesis. The statistics are calculated as follows:
If statistic == "d",
$$\sum_{m=1}^{M} \sum_{t=1}^{T} \omega_t \left( I_{tm}^{(FU)} - I_{tm}^{(FL)}\right).$$
If statistic == "D",
$$\sum_{m=1}^{M} \sum_{t=1}^{T} \omega_t \left( I_{tm}^{(FU)} - I_{tm}^{(FL)} - I_{tm}^{(BU)} + I_{tm}^{(BL)}\right).$$
If statistic == "s",
$$\sum_{m=1}^{M} \sum_{t=1}^{T} \omega_t \left( I_{tm}^{(FU)} + I_{tm}^{(FL)}\right).$$
If statistic == "S",
$$\sum_{m=1}^{M} \sum_{t=1}^{T} \omega_t \left( I_{tm}^{(FU)} + I_{tm}^{(FL)} - I_{tm}^{(BU)} - I_{tm}^{(BL)}\right).$$
If statistic == "U",
$$\sum_{m=1}^{M} \sum_{t=1}^{T} \omega_t \left( I_{tm}^{(FU)} - I_{tm}^{(BU)}\right).$$
If statistic == "L",
$$\sum_{m=1}^{M} \sum_{t=1}^{T} \omega_t \left( I_{tm}^{(BL)} - I_{tm}^{(FL)}\right).$$
If statistic == "W",
$$\sum_{m=1}^{M} \sum_{t=1}^{T} \omega_t \left( I_{tm}^{(FU)} + I_{tm}^{(BL)}\right).$$
Here, \(\omega_t\) are the weights given to the different records according to their position in the series, \(I_{tm}\) are the record indicators (see I.record), and \((FU)\), \((FL)\), \((BU)\) and \((BL)\) denote forward upper, forward lower, backward upper and backward lower records, respectively. The statistics \(d\), \(D\) and \(W\) may be used for trend in location; \(s\) and \(S\) may be used for trend in variation; and \(U\) and \(L\) may be used for trend in the upper and lower tails of the distribution, respectively.
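The definitions above can be sketched in base R for a single series (\(M = 1\)). This is only an illustration, not the package's implementation: the helpers upper_records, lower_records and record_stat are hypothetical names, the first observation is counted as a trivial record, ties are assumed absent, and backward records are assumed to be indexed by their position in the reversed series.

```r
# Hypothetical helper: forward upper record indicators of a numeric vector.
# The first observation is counted as a (trivial) record; ties are not records.
upper_records <- function(x) c(1L, as.integer(x[-1] > cummax(x)[-length(x)]))

# Lower records of x are upper records of -x.
lower_records <- function(x) upper_records(-x)

# Hypothetical helper: weighted record statistic for one series.
record_stat <- function(x,
                        statistic = c("D", "d", "S", "s", "U", "L", "W"),
                        weights = function(t) 1) {
  statistic <- match.arg(statistic)
  w  <- rep_len(weights(seq_along(x)), length(x))  # omega_t for t = 1, ..., T
  FU <- upper_records(x)                           # forward upper
  FL <- lower_records(x)                           # forward lower
  BU <- upper_records(rev(x))                      # backward upper (assumed indexing)
  BL <- lower_records(rev(x))                      # backward lower (assumed indexing)
  terms <- switch(statistic,
    d = FU - FL,
    D = FU - FL - BU + BL,
    s = FU + FL,
    S = FU + FL - BU - BL,
    U = FU - BU,
    L = BL - FL,
    W = FU + BL
  )
  sum(w * terms)
}

x <- c(3, 1, 4, 2, 5)  # made-up toy series
record_stat(x, "D")
record_stat(x, "D", weights = function(t) t - 1)
```

For several series, the statistic would be the sum of these per-series values over the \(M\) columns of X.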
The statistics, say \(X\), are approximately normally distributed, so the standardised statistic is $$Z = \frac{X - \mu}{\sigma}.$$ The mean \(\mu\) of each statistic is simple to calculate, but its variance \(\sigma^2\) is a cumbersome expression. Some variances are given by Diersen and Trenkler (2001), and all of them can be computed from the expressions for the covariances given by Cebrián, Castillo-Mateo and Asín (2022).
If correct = TRUE, then a continuity correction is employed:
$$Z = \frac{X \pm 0.5 - \mu}{\sigma},$$
with "\(-\)" if the alternative is greater and "\(+\)" if the alternative is less. The correction is not recommended for the statistics with \(\mu=0\).
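As a rough sketch of the normal approximation and the continuity correction, the one-sided p-value can be computed in base R as follows. The helper normal_p_value and the observed value 6 are hypothetical. The variance used is the classical result of Foster and Stuart (1954) for the unweighted \(d\) statistic of a single series, \(\sigma^2 = 2\sum_{t=2}^{T} 1/t\) with \(\mu = 0\); other statistics and weights need their own \(\mu\) and \(\sigma^2\).

```r
# Hypothetical helper: one-sided p-value from the normal approximation,
# optionally with the continuity correction of 0.5.
normal_p_value <- function(x, mu, sigma,
                           alternative = c("greater", "less"),
                           correct = FALSE) {
  alternative <- match.arg(alternative)
  shift <- if (correct) 0.5 else 0
  if (alternative == "greater") {
    z <- (x - shift - mu) / sigma        # "-" for the greater alternative
    p <- pnorm(z, lower.tail = FALSE)
  } else {
    z <- (x + shift - mu) / sigma        # "+" for the less alternative
    p <- pnorm(z)
  }
  c(Z = z, p.value = p)
}

# Unweighted d for one series of length T = 50 (mu = 0 under the null).
T_len   <- 50
sigma_d <- sqrt(2 * sum(1 / (2:T_len)))

normal_p_value(6, mu = 0, sigma = sigma_d)                  # 6 is a made-up value
normal_p_value(6, mu = 0, sigma = sigma_d, correct = TRUE)
```

The corrected version moves the statistic 0.5 towards \(\mu\) before standardising, so its p-value is slightly more conservative.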
When \(M>1\), the expression of the variance under the null hypothesis can be substituted by the sample variance in the \(M\) series, \(\hat{\sigma}^2\). In this case, the statistics are asymptotically \(t\) distributed, which is a more robust alternative against serial correlation.
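One way to read the \(t\) approach is as a one-sample \(t\)-test on the \(M\) per-series values of the statistic. The sketch below illustrates this for the unweighted \(d\) statistic on simulated IID data; it assumes that \(\hat{\sigma}^2\) is \(M\) times the sample variance of the per-series values, which may not match the package internals exactly. The helpers upper_records and d_stat are hypothetical names.

```r
set.seed(1)
T_len <- 40                                   # length of each series
M     <- 12                                   # number of series
X <- matrix(rnorm(T_len * M), nrow = T_len)   # simulated IID data (null holds)

# Hypothetical helpers: record indicators and the unweighted d statistic.
upper_records <- function(x) c(1L, as.integer(x[-1] > cummax(x)[-length(x)]))
d_stat <- function(x) sum(upper_records(x) - upper_records(-x))

d_m  <- apply(X, 2, d_stat)   # one value of d per series
mu_1 <- 0                     # mean of d for a single series under the null

# Sum over series, standardised with the sample variance across series.
t_stat    <- (sum(d_m) - M * mu_1) / sqrt(M * var(d_m))
p_greater <- pt(t_stat, df = M - 1, lower.tail = FALSE)
c(t = t_stat, df = M - 1, p.value = p_greater)
```

Under this reading, the result coincides with t.test(d_m, mu = mu_1, alternative = "greater").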
If permutation.test = TRUE, the p-value is estimated by permutation simulations. This is the only method of calculating p-values that does not require that the columns of X be independent.
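A minimal sketch of a permutation p-value is given below. It assumes that whole rows of X (time points) are permuted jointly, so that the dependence between columns is preserved, and it uses the common estimate \((1 + \#\{\text{extreme}\})/(B + 1)\); both are assumptions for illustration and may differ from the package's procedure. The helpers upper_records, d_stat and perm_p_value are hypothetical names.

```r
# Hypothetical helpers: record indicators and the unweighted d statistic
# summed over the columns of a matrix.
upper_records <- function(x) c(1L, as.integer(x[-1] > cummax(x)[-length(x)]))
d_stat <- function(X) {
  sum(apply(as.matrix(X), 2, function(x) sum(upper_records(x) - upper_records(-x))))
}

# Hypothetical helper: permutation p-value, permuting the rows of X jointly.
perm_p_value <- function(X, stat_fun, B = 1000,
                         alternative = c("greater", "less")) {
  alternative <- match.arg(alternative)
  X    <- as.matrix(X)
  obs  <- stat_fun(X)
  sims <- replicate(B, stat_fun(X[sample(nrow(X)), , drop = FALSE]))
  extreme <- if (alternative == "greater") sims >= obs else sims <= obs
  (1 + sum(extreme)) / (B + 1)
}

set.seed(1)
# Made-up data: two series of length 30 with a strong upward trend.
X <- matrix(rnorm(60, sd = 0.1), nrow = 30, ncol = 2) + seq_len(30)
perm_p_value(X, d_stat, B = 199)
```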
If simulate.p.value = TRUE, the p-value is estimated by Monte Carlo simulations. If the normal asymptotic statistic "D", "S" or "W" is used when the length of the series \(T\) is greater than roughly 1000 to 1500, permutations or this approach are preferable due to the computational cost of calculating the variance of the statistic under the null hypothesis. The exception is "D" without weights, which has an alternative algorithm implemented to calculate the variance quickly.
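The Monte Carlo idea can be sketched as follows: since the statistics are distribution-free under the null hypothesis, data of the same dimensions as X are simulated from any continuous distribution and the statistic is recomputed each time. This is an illustrative sketch with hypothetical helper names (upper_records, d_stat, mc_p_value), it assumes independent columns, and it is not the package's code.

```r
# Hypothetical helpers: record indicators and the unweighted d statistic
# summed over the columns of a matrix.
upper_records <- function(x) c(1L, as.integer(x[-1] > cummax(x)[-length(x)]))
d_stat <- function(X) {
  sum(apply(as.matrix(X), 2, function(x) sum(upper_records(x) - upper_records(-x))))
}

# Hypothetical helper: Monte Carlo p-value under IID continuous data.
mc_p_value <- function(X, stat_fun, B = 1000,
                       alternative = c("greater", "less")) {
  alternative <- match.arg(alternative)
  X    <- as.matrix(X)
  obs  <- stat_fun(X)
  # Simulate B data sets with the same T and M under the null hypothesis.
  sims <- replicate(B, stat_fun(matrix(runif(length(X)), nrow = nrow(X))))
  extreme <- if (alternative == "greater") sims >= obs else sims <= obs
  (1 + sum(extreme)) / (B + 1)
}

set.seed(2)
# Made-up data: two series of length 30 with a strong downward trend.
X <- matrix(rnorm(60, sd = 0.1), nrow = 30, ncol = 2) - seq_len(30)
mc_p_value(X, d_stat, B = 199, alternative = "less")
```

No variance under the null hypothesis is needed here, which is why this route avoids the cost mentioned above for long series.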
Castillo-Mateo J, Cebrián AC, Asín J (2023). “Statistical Analysis of Extreme and Record-Breaking Daily Maximum Temperatures in Peninsular Spain during 1960–2021.” Atmospheric Research, 293, 106934. doi:10.1016/j.atmosres.2023.106934.
Cebrián AC, Castillo-Mateo J, Asín J (2022). “Record Tests to Detect Non Stationarity in the Tails with an Application to Climate Change.” Stochastic Environmental Research and Risk Assessment, 36(2), 313–330. doi:10.1007/s00477-021-02122-w.
Diersen J, Trenkler G (1996). “Records Tests for Trend in Location.” Statistics, 28(1), 1–12. doi:10.1080/02331889708802543.
Diersen J, Trenkler G (2001). “Weighted Records Tests for Splitted Series of Observations.” In J Kunert, G Trenkler (eds.), Mathematical Statistics with Applications in Biometry: Festschrift in Honour of Prof. Dr. Siegfried Schach, pp. 163–178. Lohmar: Josef Eul Verlag.
Foster FG, Stuart A (1954). “Distribution-Free Tests in Time-Series Based on the Breaking of Records.” Journal of the Royal Statistical Society B, 16(1), 1–22. doi:10.1111/j.2517-6161.1954.tb00143.x.
foster.plot, N.plot, N.test
# D-statistic
foster.test(ZaragozaSeries)
# D-statistic with linear weights
foster.test(ZaragozaSeries, weights = function(t) t - 1)
# S-statistic with linear weights
foster.test(ZaragozaSeries, statistic = "S", weights = function(t) t - 1)
# D-statistic with weights and t approach
foster.test(ZaragozaSeries, distribution = "t", weights = function(t) t - 1)
# U-statistic with weights (upper tail)
foster.test(ZaragozaSeries, statistic = "U", weights = function(t) t - 1)
# L-statistic with weights (lower tail)
foster.test(ZaragozaSeries, statistic = "L", weights = function(t) t - 1)