Supervised dimension reduction for multivariate time series data. There are three different algorithms to choose from: TSIR is a time series version of Sliced Inverse Regression (SIR), TSAVE is a time series version of Sliced Average Variance Estimate (SAVE), and TSSH (Time series SIR SAVE Hybrid) is a hybrid of TSIR and TSAVE. For a summary of an object of class tssdr, see summary.tssdr.
tssdr(y, X, ...)
# S3 method for default
tssdr(y, X, algorithm = c("TSIR", "TSAVE", "TSSH"), k = 1:12, H = 10, weight = 0.5,
  method = c("frjd", "rjd"), eps = 1e-06, maxiter = 1000, ...)
# S3 method for ts
tssdr(y, X, ...)
# S3 method for xts
tssdr(y, X, ...)
# S3 method for zoo
tssdr(y, X, ...)
# S3 method for tssdr
print(x, digits = 3, ...)
# S3 method for tssdr
components(x, ...)
# S3 method for tssdr
plot(x, main = "The response and the directions", ...)
algorithm: Algorithm to be used. The options are "TSIR", "TSAVE" and "TSSH". Default is "TSIR".
k: A vector of lags. It can be any positive integer, or a vector of positive integers. Default is 1:12.
H: The number of slices. If "TSSH" is used, \(H\) is a 2-vector; the first element is used for the TSIR part and the second for the TSAVE part. Default is \(H = 10\).
weight: Weight \(0 \le a \le 1\) for the hybrid method TSSH only. With \(a = 1\) it reduces to TSAVE and with \(a = 0\) to TSIR. Default is \(a = 0.5\).
eps: Convergence tolerance.
maxiter: The maximum number of iterations.
...: Further arguments to be passed to or from methods.
x: An object of class tssdr.
digits: The number of digits when printing an object of class tssdr. Default is 3.
main: A title for the plot when plotting an object of class tssdr.
A list of class 'tssdr' containing the following components:
W: The estimated signal separation matrix.
k: The vector of the used lags.
S: The estimated directions as a time series object standardized to have mean 0 and unit variances.
MU: The mean vector of X.
L: The Lambda matrix for choosing lags and directions.
H: The used number of slices.
yname: The name for the response time series \(y\).
Xname: The name for the predictor time series \(\bf X\).
algorithm: The used algorithm as a character string.
Assume that the \(p\)-variate time series \({\bf Z}_t\) with \(T\) observations is whitened, i.e. \({\bf Z}_t={\bf S}^{-1/2}({\bf X}_t - \frac{1}{T}\sum_{t=1}^T {\bf X}_{t})\), for \(t = 1, \ldots, T\), where \({\bf S}\) is the sample covariance matrix of \({\bf X}\). Divide \(y\) into \(H\) disjoint intervals (slices) by its empirical quantiles.
For each lag \(j\), denote \(y_{j}\) for the vector of the last \(T - j\) values of the sliced \(y\), and denote \({\bf Z}_j\) for the first \(T - j\) observations of \({\bf Z}\). Then \({\bf Z}_{jh}\), \(h = 1, \ldots, H\), are the disjoint slices of \({\bf Z}_j\) according to the values of \(y_{j}\).
Let \(T_{jh}\) be the number of observations in \({\bf Z}_{jh}\).
Write \({\bf \widehat{A}}_{jh} = \frac{1}{T_{jh}}\sum_{t = 1}^{T_{jh}}({\bf Z}_{jh})_{t}\)
and \({\bf \widehat A}_j = ({\bf \widehat{A}}_{j1}, \ldots, {\bf \widehat{A}}_{jH})'\).
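The whitening and slicing step can be written out directly in R. The following is a minimal illustrative sketch, not the package's implementation; the helper name whiten_and_slice and the quantile-based slicing via cut are assumptions made for the example.
## Illustrative sketch: whitening of X and slicing of y for a single lag j
whiten_and_slice <- function(y, X, j = 1, H = 10) {
  X <- as.matrix(X)
  n <- nrow(X)
  S <- cov(X)                                      # sample covariance matrix of X
  eig <- eigen(S, symmetric = TRUE)                # symmetric inverse square root S^(-1/2)
  S_inv_sqrt <- eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)
  Z <- sweep(X, 2, colMeans(X)) %*% S_inv_sqrt     # whitened series Z
  y_j <- y[(j + 1):n]                              # last n - j values of y
  Z_j <- Z[1:(n - j), , drop = FALSE]              # first n - j observations of Z
  breaks <- unique(quantile(y_j, probs = seq(0, 1, length.out = H + 1)))
  slice <- cut(y_j, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  list(Z_j = Z_j, slice = slice, S_inv_sqrt = S_inv_sqrt)
}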
Then for the algorithm TSIR the lag-\(j\) matrix is $${\bf \widehat{M}}_{0j} = {\bf \widehat{Cov}}_{{\bf \widehat{A}}_j}.$$
Denote \({\bf \widehat{Cov}}_{jh}\), \(h = 1, \ldots, H\), for the sample covariance matrix of \({\bf Z}_{jh}\). Then for the algorithm TSAVE the lag-\(j\) matrix is $${\bf \widehat{M}}_{1j} = \frac{1}{H}\sum_{h = 1}^H({\bf I}_p - {\bf \widehat{Cov}}_{jh})^2.$$
For the hybrid algorithm TSSH the lag-\(j\) matrix is $${\bf \widehat{M}}_{2j} = a{\bf \widehat{M}}_{1j} + (1-a){\bf \widehat{M}}_{0j},$$ for a chosen weight \(0 \le a \le 1\). Note that the value of \(H\) can be different for the TSIR and TSAVE parts.
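Continuing the sketch above, the lag-\(j\) matrices of the three algorithms could be formed as follows (again only an illustrative sketch, assuming Z_j and slice as returned by whiten_and_slice above; the helper name slice_matrices is hypothetical).
## Illustrative sketch: M_0j (TSIR), M_1j (TSAVE) and M_2j (TSSH) for one lag j
slice_matrices <- function(Z_j, slice, a = 0.5) {
  p <- ncol(Z_j)
  groups <- split.data.frame(Z_j, slice)            # the slices Z_jh
  H <- length(groups)
  A_j <- t(sapply(groups, colMeans))                # slice means A_jh, one per row
  M0 <- cov(A_j)                                    # TSIR: covariance matrix of the slice means
  M1 <- Reduce(`+`, lapply(groups, function(Zjh) {
    C <- cov(Zjh)                                   # within-slice covariance Cov_jh
    (diag(p) - C) %*% (diag(p) - C)                 # (I_p - Cov_jh)^2
  })) / H                                           # TSAVE: average over the H slices
  M2 <- a * M1 + (1 - a) * M0                       # TSSH: hybrid with weight a
  list(M0 = M0, M1 = M1, M2 = M2)
}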
The algorithms find an orthogonal matrix \({\bf U} = ({\bf u}_1, \ldots, {\bf u}_p)'\) by maximizing, for \(b = 0, 1\) or \(2\) (corresponding to TSIR, TSAVE and TSSH, respectively), $$\sum_{j \in k} ||{\rm diag}({\bf U} {\bf \widehat{M}}_{bj} {\bf U}')||^2 = \sum_{i = 1}^p \sum_{j \in k} ({\bf u}_i' {\bf \widehat{M}}_{bj} {\bf u}_i)^2,$$ where the sums run over the directions \(i = 1, \ldots, p\) and the chosen lags \(j \in k\). The final signal separation matrix is then \({\bf W} = {\bf US}^{-1/2}\).
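The maximization is carried out by joint diagonalization of the matrices over the chosen lags, using frjd or rjd from the JADE package (the method argument). A rough end-to-end sketch using the hypothetical helpers above, assuming a predictor matrix X and a response y; the TSIR matrices and the lag set k = 1:3 are chosen only for illustration.
## Illustrative sketch: joint diagonalization over lags and the final W
library(JADE)
k <- 1:3
prep <- lapply(k, function(j) whiten_and_slice(y, X, j = j, H = 10))
M <- sapply(prep, function(pr) slice_matrices(pr$Z_j, pr$slice)$M0,
            simplify = "array")                     # p x p x length(k) array of the M_0j
jd <- frjd(M)                                       # maximizes the sum of squared diagonal elements
U <- t(jd$V)                                        # rows of U are the vectors u_i'
W <- U %*% prep[[1]]$S_inv_sqrt                     # signal separation matrix W = U S^(-1/2)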
Write \(\lambda_{ij} = c({\bf u}_i' {\bf \widehat{M}}_{bj} {\bf u}_i)^2\), where \(c\) is chosen in such a way that \(\sum_{i = 1}^p \sum_{j \in k} \lambda_{ij} = 1\). Then the \((i, j)\)th element of the matrix \(\bf L\) is \(\lambda_{ij}\).
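Continuing the same sketch, the matrix \(\bf L\) of the \(\lambda_{ij}\) values could be computed from U and the array M above:
## Illustrative sketch: the Lambda matrix with (i, j)th element lambda_ij
L_mat <- sapply(seq_len(dim(M)[3]),
                function(j) diag(U %*% M[, , j] %*% t(U))^2)   # (u_i' M_bj u_i)^2
L_mat <- L_mat / sum(L_mat)                                    # scale so that all entries sum to one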
To choose which lags and directions to keep, see summary.tssdr. Note that when printing a tssdr object, all elements are printed except the directions S.
Matilainen, M., Croux, C., Nordhausen, K. and Oja, H. (2017), Supervised Dimension Reduction for Multivariate Time Series, Econometrics and Statistics, 4, 57--69.
Matilainen, M., Croux, C., Nordhausen, K. and Oja, H. (2019), Sliced Average Variance Estimation for Multivariate Time Series. Statistics: A Journal of Theoretical and Applied Statistics, 53, 630--655.
Li, K.C. (1991), Sliced Inverse Regression for Dimension Reduction, Journal of the American Statistical Association, 86, 316--327.
Cook, R. and Weisberg, S. (1991), Comment on "Sliced Inverse Regression for Dimension Reduction", Journal of the American Statistical Association, 86, 328--332.
library(tsBSS)
n <- 10000
A <- matrix(rnorm(9), 3, 3)
# Three latent ARMA series; x1 drives the response at lag one
x1 <- arima.sim(n = n, list(ar = 0.2))
x2 <- arima.sim(n = n, list(ar = 0.8))
x3 <- arima.sim(n = n, list(ar = 0.3, ma = -0.4))
eps2 <- rnorm(n - 1)
y <- 2*x1[1:(n - 1)] + eps2
# The observed predictors are a linear mixture of the latent series
X <- ((cbind(x1, x2, x3))[2:n, ]) %*% t(A)
# Estimate the directions with TSAVE (default lags k = 1:12 and H = 10 slices)
res1 <- tssdr(y, X, algorithm = "TSAVE")
res1
# Summary of the chosen lags and directions (all lags considered, threshold 0.8)
summ1 <- summary(res1, type = "alllag", thres = 0.8)
summ1
plot(summ1)
head(components(summ1))
coef(summ1)
# Hybrid of TSIR and TSAVE. For the TSIR part H = 10 and for the TSAVE part H = 2.
tssdr(y, X, algorithm = "TSSH", weight = 0.6, H = c(10, 2))