foreca: Forecastable Component Analysis

Description

foreca performs Forecastable Component Analysis (ForeCA) on $\mathbf{X}_t$ -- a $K$-dimensional time series with $T$ observations. Users should only call foreca, rather than foreca.one_weightvector or foreca.multiple_weightvectors.

foreca.one_weightvector is a wrapper around several algorithms that solve the ForeCA optimization problem for a single weightvector $\mathbf{w}_i$ and whitened time series $\mathbf{U}_t$.

foreca.multiple_weightvectors applies foreca.one_weightvector iteratively to $\mathbf{U}_t$ in order to obtain multiple weightvectors that yield the most forecastable, uncorrelated signals.

Usage

foreca(series, n.comp = 2, algorithm.control = list(type = "EM"), ...)
foreca.one_weightvector(
  U,
  f.U = NULL,
  spectrum.control = list(),
  entropy.control = list(),
  algorithm.control = list(),
  keep.all.optima = FALSE,
  dewhitening = NULL,
  ...
)
foreca.multiple_weightvectors(
  U,
  spectrum.control = list(),
  entropy.control = list(),
  algorithm.control = list(),
  n.comp = 2,
  plot = FALSE,
  dewhitening = NULL,
  ...
)

Arguments

series

a $T \times K$ array with $T$ observations from the $K$-dimensional time series $\mathbf{X}_t$. Can be a matrix, data.frame, or a multivariate ts object.

n.comp

positive integer; number of components to be extracted. Default: 2.

algorithm.control

list; control settings for any iterative ForeCA algorithm. See complete_algorithm_control for details.

...

additional arguments passed to available ForeCA algorithms.

U

a $T \times K$ array with $T$ observations from the $K$-dimensional whitened (see whiten) time series $\mathbf{U}_t$. Can be a matrix, data.frame, or a multivariate ts object.

f.U

multivariate spectrum of class 'mvspectrum' with normalize = TRUE.

spectrum.control

list; control settings for spectrum estimation. See complete_spectrum_control for details.

entropy.control

list; control settings for entropy estimation. See complete_entropy_control for details.

keep.all.optima

logical; if TRUE, it keeps the optimal solutions of each random start. Default: FALSE (only returns the best solution).

dewhitening

optional; if provided (as returned by whiten), the dewhitening transformation is used to recover the original series $\mathbf{X}_t$, and the (normalized) vector corresponding to the series $\mathbf{X}_{t,i}$ with the largest Omega is used as the initial weightvector.

plot

logical; if TRUE, a plot of the current optimal solution $\mathbf{w}_i^*$ is shown and updated for each iteration $i = 1, \ldots,$ n.comp of any iterative algorithm. Default: FALSE.

Value

An object of class foreca, which is similar to the output from princomp, with the following components (amongst others):

center: sample mean $\widehat{\mu}_X$ of each series,
whitening: whitening matrix of size $K \times K$ from whiten: $\mathbf{U}_t = (\mathbf{X}_t - \widehat{\mu}_X) \cdot whitening$; note that $\mathbf{X}_t$ is centered prior to the whitening transformation,
weightvectors: orthonormal matrix of size $K \times n.comp$, which converts whitened data to n.comp forecastable components (ForeCs) $\mathbf{F}_t = \mathbf{U}_t \cdot weightvectors$,
loadings: combination of whitening $\times$ weightvectors to obtain the final loadings for the original data: $\mathbf{F}_t = (\mathbf{X}_t - \widehat{\mu}_X) \cdot whitening \cdot weightvectors$; again, it centers $\mathbf{X}_t$ first,
loadings.normalized: normalized loadings (unit norm); note that signals computed with these normalized loadings no longer have variance 1,
scores: n.comp forecastable components $\mathbf{F}_t$; they have mean 0, variance 1, and are uncorrelated,
Omega: forecastability score of each ForeC of $\mathbf{F}_t$.

ForeCs are ordered from most to least forecastable (according to Omega).
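The relationships between whitening, weightvectors, loadings, and scores listed above can be checked with a short base-R sketch. It does not call foreca: the whitening matrix is built here as $\Sigma^{-1/2}$ from an eigendecomposition (one possible choice, not necessarily what whiten does), and the weightvectors are an arbitrary orthonormal matrix rather than an optimized ForeCA solution. All variable names are illustrative.

```r
# Base-R sketch (no ForeCA calls); names and construction are assumptions.
set.seed(1)
n.obs <- 200; K <- 3; n.comp <- 2
A <- matrix(c(1, 0.5, 0.2, 0, 1, 0.3, 0, 0, 1), K, K)  # mixing matrix
X <- matrix(rnorm(n.obs * K), n.obs, K) %*% A          # correlated series

# center: sample mean of each series; X is centered before whitening
mu.hat <- colMeans(X)
X.centered <- sweep(X, 2, mu.hat)

# One possible K x K whitening matrix: Sigma^{-1/2} (assumption)
eig <- eigen(cov(X.centered), symmetric = TRUE)
whitening <- eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)
U <- X.centered %*% whitening

# Arbitrary orthonormal K x n.comp matrix standing in for the weightvectors
weightvectors <- qr.Q(qr(matrix(rnorm(K * n.comp), K, n.comp)))

# loadings = whitening %*% weightvectors; scores follow from the centered data
loadings <- whitening %*% weightvectors
scores <- X.centered %*% loadings

# Column-wise unit-norm loadings; the resulting signals lose unit variance
loadings.normalized <- sweep(loadings, 2, sqrt(colSums(loadings^2)), "/")
scores.normalized <- X.centered %*% loadings.normalized

round(cov(U), 8)                # identity matrix
round(cov(scores), 8)           # identity matrix: unit variance, uncorrelated
diag(cov(scores.normalized))    # generally not equal to 1
```

Because the weightvectors are orthonormal and $\mathbf{U}_t$ has identity covariance, the scores inherit mean 0, variance 1, and zero correlation regardless of which orthonormal matrix is used; ForeCA's contribution is choosing the one that maximizes forecastability.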

Warning

Estimating Omega directly from the ForeCs $\mathbf{F}_t$ can give values that differ from the $Omega estimates reported by foreca, for the following reason:

In theory $f_y(\lambda)$ of a linear combination $y_t = \mathbf{X}_t \mathbf{w}$ can be analytically computed from the multivariate spectrum $f_{\mathbf{X}}(\lambda)$ by the quadratic form $f_y(\lambda) = \mathbf{w}' f_{\mathbf{X}}(\lambda) \mathbf{w}$ for all $\lambda$ (see spectrum_of_linear_combination).
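For the raw (unsmoothed) periodogram, this quadratic-form identity holds exactly, which can be verified in base R. The sketch below is only an illustration under stated assumptions: it uses simulated white noise, a hand-picked weightvector w, and a periodogram scaled by $1/n$ (scaling conventions differ between implementations). It does not use spectrum_of_linear_combination or any other ForeCA function.

```r
# Base-R sketch: raw periodogram of y_t = X_t w equals w' I_X(lambda) w.
set.seed(42)
n <- 128; K <- 3
X <- matrix(rnorm(n * K), n, K)
X <- sweep(X, 2, colMeans(X))      # center each series

w <- c(0.6, -0.3, 0.5)             # illustrative weightvector
w <- w / sqrt(sum(w^2))            # normalize to unit norm
y <- drop(X %*% w)

# Direct route: periodogram of the combined univariate series
I.y.direct <- Mod(fft(y))^2 / n

# Quadratic-form route: K x K periodogram matrix at each frequency
D <- mvfft(X)                      # n x K matrix of column-wise DFTs
I.y.quadratic <- vapply(seq_len(n), function(j) {
  I.X.j <- outer(D[j, ], Conj(D[j, ])) / n   # Hermitian K x K matrix
  Re(drop(t(w) %*% I.X.j %*% w))             # imaginary parts cancel
}, numeric(1))

all.equal(I.y.direct, I.y.quadratic)
```

The two routes agree here because the discrete Fourier transform is linear and no smoothing or other data-driven tuning is involved.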

In practice, however, this identity does not always hold exactly, since the (often data-driven) control settings for spectrum estimation are not identical for the high-dimensional, noisy $\mathbf{X}_t$ and the combined univariate time series $y_t$ (which is usually smoother and less variable). Thus, estimating $\widehat{f}_y$ directly from $y_t$ can give slightly different estimates than computing it as $\mathbf{w}'\widehat{f}_{\mathbf{X}}\mathbf{w}$. Consequently, the Omega estimates can differ as well.
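The effect of mismatched smoothing settings can be reproduced with a small base-R sketch. Everything in it is an assumption made for illustration: the circular moving-average smoother, the window widths 5 and 11 (standing in for data-driven choices), and omega_sketch, a simplified entropy-based score that is not ForeCA's Omega estimator.

```r
# Base-R sketch: same smoothing preserves the identity, different smoothing does not.
set.seed(7)
n <- 128; K <- 3
X <- scale(matrix(rnorm(n * K), n, K), scale = FALSE)   # centered series
w <- c(0.6, -0.3, 0.5); w <- w / sqrt(sum(w^2))         # illustrative weightvector

D <- mvfft(X)                              # column-wise DFTs
I.y <- Mod(drop(D %*% w))^2 / n            # raw periodogram of y_t = X_t w

# Hypothetical smoother: circular moving average with an odd window width
smooth_circular <- function(x, width) {
  as.numeric(stats::filter(x, rep(1 / width, width), circular = TRUE))
}

# Route 1: smooth every entry of the multivariate periodogram (width 5),
# then combine with the quadratic form w' f_X w
f.y.combined <- numeric(n)
for (a in seq_len(K)) for (b in seq_len(K)) {
  cross.ab <- Re(D[, a] * Conj(D[, b])) / n
  f.y.combined <- f.y.combined + w[a] * w[b] * smooth_circular(cross.ab, 5)
}

# Route 2: smooth the univariate periodogram directly
f.y.same  <- smooth_circular(I.y, 5)    # same setting as route 1
f.y.other <- smooth_circular(I.y, 11)   # different setting

# Simplified stand-in for a forecastability score (NOT ForeCA's Omega)
omega_sketch <- function(f) {
  p <- f / sum(f)
  1 + sum(p * log(p)) / log(length(p))
}

all.equal(f.y.combined, f.y.same)           # identical settings agree
max(abs(f.y.combined - f.y.other))          # different settings do not
c(omega_sketch(f.y.same), omega_sketch(f.y.other))
```

With identical settings the two routes coincide because smoothing is linear; once the univariate series receives its own setting, both the spectrum estimate and any score derived from it change.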

In general, these differences are small and have no relevant implications for estimating ForeCs. In rare cases, however, the obtained ForeCs can have a smaller Omega than the maximum Omega of the original series. In such a case users should not re-estimate $\Omega$ from the resulting ForeCs $\mathbf{F}_t$, but access the estimates via $Omega in the foreca output (the univariate estimates are stored in $Omega.univ).

References

Goerg, G. M. (2013). “Forecastable Component Analysis”. Journal of Machine Learning Research (JMLR) W&CP 28 (2): 64-72. Available at http://jmlr.org/proceedings/papers/v28/goerg13.html.

Examples


library(ForeCA)

# Daily log-returns (in percent) of the European stock indices
XX <- diff(log(EuStockMarkets[100:200, ])) * 100
plot(ts(XX))

# Extract the two most forecastable components;
# plot = TRUE displays the current optimal solution at each iteration
ff <- foreca(XX[, 1:4], n.comp = 2, plot = TRUE)
ff
summary(ff)
plot(ff)

# Lower-level interface: whiten first, then estimate a single weightvector
PW <- whiten(XX)
one.weight.em <- foreca.one_weightvector(U = PW$U,
                                         dewhitening = PW$dewhitening,
                                         algorithm.control =
                                           list(num.starts = 2,
                                                type = "EM"),
                                         spectrum.control =
                                           list(method = "wosa"))
plot(one.weight.em)

# Estimate several weightvectors iteratively from the whitened series
ff <- foreca.multiple_weightvectors(PW$U, n.comp = 2,
                                    dewhitening = PW$dewhitening)
ff
plot(ff$scores)
