series_uncor: Subset of Uncorrelated Series

Description

Given a matrix, this function extracts a subset of columns that are uncorrelated with one another in a specific sense (see Details).

Usage

series_uncor(
  X,
  return.value = c("series", "indexes"),
  type = c("adjacent", "all"),
  first.last = TRUE,
  m = 1,
  alpha = 0.05,
  ...
)

Arguments

X

A numeric matrix (or data frame) from which the uncorrelated vectors are extracted.

return.value

A character string indicating what the function returns: "series" for a matrix with uncorrelated columns, or "indexes" for a vector with the positions of the uncorrelated columns in X.

type

A character string indicating the type of uncorrelation wanted between the extracted series (or columns), "adjacent" or "all" (see Details).

first.last

Logical. Indicates whether the first and last columns must also be uncorrelated (used when type = "adjacent").

m

Integer value giving the starting column.

alpha

Numeric value in \((0,1)\). It gives the significance level of the correlation test, whose alternative hypothesis is that the true correlation is not equal to 0.

...

Further arguments to be passed to the cor.test function (see cor.test for possible arguments).
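As a rough illustration of the decision that alpha controls, the sketch below applies stats::cor.test to simulated vectors and compares each p-value with alpha. The data, the seed and the choice of method = "spearman" are assumptions made only for this example; it does not show how series_uncor itself forwards the extra arguments.

```r
# Illustrative only: simulated data, not taken from the package
set.seed(1)
x <- rnorm(100)
y <- x + rnorm(100, sd = 0.5)   # built to be correlated with x
z <- rnorm(100)                 # generated independently of x

alpha <- 0.05

# Two-sided test of H0: the true correlation equals 0
p_xy <- cor.test(x, y, alternative = "two.sided")$p.value

# An extra argument such as method is the kind of option "..." can carry
p_xz <- cor.test(x, z, alternative = "two.sided", method = "spearman")$p.value

# A pair is treated as significantly correlated when its p-value is below alpha
p_xy < alpha
p_xz < alpha
```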

Value

A matrix or a vector as specified by return.value.

Details

This function is used in the data preparation (or pre-processing) often required to apply the exploratory and inference tools based on the theory of records within this package.

Given a matrix X considered as a set of \(M^*\) vectors, which are the columns of X, this function extracts the biggest subset of uncorrelated vectors (columns), using the following procedure: starting from column m, cor.test is applied to pairs of columns, and the pairs that are tested depend on the argument type.

If type = "adjacent", the test is computed between column \(m\) and columns \(m + 1, m + 2, \ldots\) in turn, until a column \(m + k\) is found that is not significantly correlated with column \(m\). The process is then repeated starting at column \(m + k\), and it continues until all columns have been checked.
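The "adjacent" search described above can be sketched as a simple loop. This is a minimal sketch under stated assumptions, not the package's implementation: the helper name uncor_adjacent_sketch is hypothetical, only columns after m are examined, a column is kept when its p-value is at least alpha, and first.last is ignored.

```r
# Hypothetical helper written for illustration; not part of the package
uncor_adjacent_sketch <- function(X, m = 1, alpha = 0.05, ...) {
  X <- as.matrix(X)
  kept <- m        # indexes of the columns kept so far
  current <- m     # most recently kept column, used as the reference
  j <- m + 1
  while (j <= ncol(X)) {
    p <- cor.test(X[, current], X[, j], ...)$p.value
    if (p >= alpha) {
      # Not significantly correlated with the reference: keep column j
      kept <- c(kept, j)
      current <- j
    }
    j <- j + 1
  }
  kept
}

# Simulated data (an assumption): each column shares a noise term
# with its immediate neighbour, so neighbouring columns are correlated
set.seed(123)
Z <- matrix(rnorm(200 * 7), nrow = 200)
X <- Z[, 1:6] + Z[, 2:7]

idx <- uncor_adjacent_sketch(X)
idx
```

Each kept column is compared only with the column kept immediately before it, which is what distinguishes this option from type = "all".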

The argument first.last only matters when type = "adjacent" (the first column being column m). If the first and last columns could be correlated with each other, use first.last = TRUE so that this pair is also required to be uncorrelated. If there is no reason to expect a significant correlation between them, first.last = FALSE is appropriate.

If type = "all", the procedure is similar to the one above, but a new column is kept only if it is not significantly correlated with any of the columns already kept, not just the previous one. This option therefore results in fewer columns.
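The stricter "all" criterion can be sketched in the same spirit. Again this is only an illustration under assumptions: the helper name uncor_all_sketch is hypothetical, only columns after m are examined, and a candidate is kept when every test against the columns already kept gives a p-value of at least alpha.

```r
# Hypothetical helper written for illustration; not part of the package
uncor_all_sketch <- function(X, m = 1, alpha = 0.05, ...) {
  X <- as.matrix(X)
  kept <- m
  candidates <- if (m < ncol(X)) (m + 1):ncol(X) else integer(0)
  for (j in candidates) {
    # Test the candidate against every column kept so far
    p_values <- vapply(
      kept,
      function(k) cor.test(X[, k], X[, j], ...)$p.value,
      numeric(1)
    )
    if (all(p_values >= alpha)) kept <- c(kept, j)
  }
  kept
}

# Simulated data (an assumption): columns 2 and 4 both share a
# component with column 1, while column 3 is independent of the rest
set.seed(42)
Z <- matrix(rnorm(200 * 4), nrow = 200)
X <- cbind(Z[, 1], Z[, 1] + Z[, 2], Z[, 3], Z[, 1] + Z[, 4])

idx_all <- uncor_all_sketch(X)
idx_all
```

In this simulated matrix column 4 is built to be correlated with column 1 but not with column 3, so a comparison against the previously kept column alone could accept it, whereas the check against every kept column rejects it.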

References

Cebrián A, Castillo-Mateo J, Asín J (2021). “Record Tests to detect non stationarity in the tails with an application to climate change.” Unpublished manuscript.

Examples

# NOT RUN {
# Split Zaragoza series
TxZ <- series_split(TX_Zaragoza$TX)

# Index of uncorrelated columns depending on the criteria
series_uncor(TxZ, return.value = "indexes", type = "adjacent")
series_uncor(TxZ, return.value = "indexes", type = "all")

# Return the set of uncorrelated vectors
ZaragozaSeries <- series_uncor(TxZ)

# }
