HDTSA (version 1.0.5)

Factors: Factor analysis for vector time series

Description

Factors() implements the factor model for high-dimensional time series proposed in Lam and Yao (2012): $${\bf y}_t = {\bf Ax}_t + {\boldsymbol{\epsilon}}_t, $$ where \({\bf y}_t\) is the observed \(p \times 1\) time series, \({\bf x}_t\) is an \(r \times 1\) latent process with (unknown) \(r \leq p\), \({\bf A}\) is a \(p \times r\) unknown constant matrix, and \( {\boldsymbol{\epsilon}}_t\) is a vector white noise process. The function estimates the number of factors \(r\) and the factor loading matrix \({\bf A}\) through an eigenanalysis of a nonnegative definite matrix, so the method remains applicable when the dimension of \({\bf y}_t\) is on the order of a few thousand.

Usage

Factors(
  Y,
  lag.k = 5,
  thresh = FALSE,
  delta = 2 * sqrt(log(ncol(Y))/nrow(Y)),
  twostep = FALSE
)

Value

An object of class "factors", which contains the following components:

factor_num

The estimated number of factors \(\hat{r}\).

loading.mat

The estimated \(p \times \hat{r}\) factor loading matrix \(\hat{\bf A}\).

X

The \(n\times \hat{r}\) matrix \(\hat{\bf X}=(\hat{\bf x}_1,\dots,\hat{\bf x}_n)'\) of estimated factors, with \(\hat{\bf x}_t = \hat{\bf A}'{\bf y}_t\).

lag.k

The time lag used in the function.
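
The relation between the X and loading.mat components can be illustrated in base R. This is a minimal sketch, not the package's internal code: `A_hat` is a hypothetical orthonormal loading matrix built with `qr.Q()` purely for illustration, and `Y` is random data rather than output from Factors().

```r
set.seed(1)
n <- 50       # number of observations
p <- 6        # dimension of y_t
r_hat <- 2    # assumed number of factors (illustrative)

Y <- matrix(rnorm(n * p), n, p)   # n x p data matrix, row t holds y_t'

# Hypothetical p x r_hat loading matrix with orthonormal columns
A_hat <- qr.Q(qr(matrix(rnorm(p * r_hat), p, r_hat)))

# Row t of X_hat equals (A_hat' y_t)', so all n factors come from one product
X_hat <- Y %*% A_hat
dim(X_hat)
```

Because the observations are stored as rows of Y, the whole factor matrix is obtained as Y times the loading matrix, which matches the \(n \times \hat{r}\) shape described above.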

Arguments

Y

An \(n \times p\) data matrix \({\bf Y} = ({\bf y}_1, \dots , {\bf y}_n )'\), where \(n\) is the number of observations of the \(p \times 1\) time series \(\{{\bf y}_t\}_{t=1}^n\).

lag.k

The time lag \(K\) used to calculate the nonnegative definite matrix \( \hat{\mathbf{M}}\): $$\hat{\mathbf{M}}\ =\ \sum_{k=1}^{K} T_\delta\{\hat{\mathbf{\Sigma}}_y(k)\} T_\delta\{\hat{\mathbf{\Sigma}}_y(k)\}'\,, $$ where \(\hat{\bf \Sigma}_y(k)\) is the sample autocovariance of \( {\bf y}_t\) at lag \(k\) and \(T_\delta(\cdot)\) is a threshold operator with the threshold level \(\delta \geq 0\). See 'Details'. The default is 5.

thresh

Logical. If thresh = FALSE (the default), no thresholding is applied when estimating \(\hat{\mathbf{M}}\). If thresh = TRUE, the threshold level \(\delta\) is taken from the argument delta.

delta

The value of the threshold level \(\delta\). The default is \( \delta = 2 \sqrt{n^{-1}\log p}\).

twostep

Logical. If twostep = FALSE (the default), the standard procedure for estimating \(r\) and \({\bf A}\) is used [see Section 2.2 in Lam and Yao (2012)]. If twostep = TRUE, the two-step estimation procedure is used instead [see Section 4 in Lam and Yao (2012)].
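
The eigenanalysis behind the lag.k argument can be sketched in base R for the case without thresholding. This is an illustrative sketch, not the package's implementation: the helpers `sample_autocov()` and `build_M()` are hypothetical names, the divisor \(n\) in the sample autocovariance is an assumption, and the number of factors is chosen by minimising the ratio of adjacent eigenvalues, as in the ratio-based estimator of Lam and Yao (2012). The search range `r_max = floor(p / 2)` is also an assumption.

```r
set.seed(123)

# Simulate n observations of a p-dimensional series driven by r AR(1) factors
n <- 400; p <- 20; r <- 3
A <- matrix(runif(p * r, -1, 1), ncol = r)
X <- cbind(as.numeric(arima.sim(list(ar = 0.6), n = n)),
           as.numeric(arima.sim(list(ar = -0.5), n = n)),
           as.numeric(arima.sim(list(ar = 0.3), n = n)))
Y <- X %*% t(A) + matrix(rnorm(n * p), n, p)   # n x p data matrix

# Sample autocovariance of y_t at lag k (divisor n is an assumption)
sample_autocov <- function(Y, k) {
  n <- nrow(Y)
  Yc <- scale(Y, center = TRUE, scale = FALSE)   # subtract column means
  crossprod(Yc[(k + 1):n, , drop = FALSE], Yc[1:(n - k), , drop = FALSE]) / n
}

# M_hat = sum over k = 1..K of Sigma_hat(k) %*% t(Sigma_hat(k)), no thresholding
build_M <- function(Y, K) {
  p <- ncol(Y)
  M <- matrix(0, p, p)
  for (k in seq_len(K)) {
    S <- sample_autocov(Y, k)
    M <- M + tcrossprod(S)
  }
  M
}

M_hat <- build_M(Y, K = 2)
eig <- eigen(M_hat, symmetric = TRUE)   # eigenvalues in decreasing order

# Ratio-based choice of the number of factors
r_max <- floor(p / 2)
ratios <- eig$values[2:(r_max + 1)] / eig$values[1:r_max]
r_hat <- which.min(ratios)

# Loadings: eigenvectors belonging to the r_hat largest eigenvalues
A_hat <- eig$vectors[, seq_len(r_hat), drop = FALSE]
```

Since each summand is a matrix times its own transpose, `M_hat` is symmetric and nonnegative definite by construction, which is why `eigen()` can be called with `symmetric = TRUE`.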

Details

The threshold operator \(T_\delta(\cdot)\) is defined as \(T_\delta({\bf W}) = \{w_{i,j}1(|w_{i,j}|\geq \delta)\}\) for any matrix \({\bf W}=(w_{i,j})\), where \(\delta \geq 0\) is the threshold level and \(1(\cdot)\) is the indicator function. Entries whose absolute value is below \(\delta\) are set to zero, and \(\delta = 0\) leaves the matrix unchanged. We recommend choosing \(\delta=0\) when \(p\) is fixed and \(\delta>0\) when \(p \gg n\).
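
The definition of \(T_\delta(\cdot)\) translates directly into an elementwise operation. The following base R sketch uses a hypothetical helper name, `threshold_matrix()`, and a small made-up matrix; it is meant only to illustrate the definition and is not taken from the package source.

```r
# Elementwise hard thresholding: keep w_ij when |w_ij| >= delta, else set it to 0
threshold_matrix <- function(W, delta) {
  W * (abs(W) >= delta)
}

W <- matrix(c(0.50, -0.05, 0.02, -0.30), 2, 2)   # filled column by column
W_thr <- threshold_matrix(W, delta = 0.1)        # small entries become zero
W_same <- threshold_matrix(W, delta = 0)         # delta = 0 keeps every entry

# Default threshold level from the Usage section, for p = 200 and n = 400
p <- 200; n <- 400
delta_default <- 2 * sqrt(log(p) / n)
```

With `delta = 0.1`, only the entries 0.50 and -0.30 survive, while -0.05 and 0.02 are replaced by zero.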

References

Lam, C., & Yao, Q. (2012). Factor modelling for high-dimensional time series: Inference for the number of factors. The Annals of Statistics, 40, 694--726. doi:10.1214/12-AOS970.

Examples

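# Example 1 (Example in Section 3.3 of Lam and Yao 2012)
library(HDTSA)

## Generate y_t
p <- 200   # dimension of y_t
n <- 400   # number of observations
r <- 3     # true number of factors
A <- matrix(runif(p*r, -1, 1), ncol=r)   # p x r factor loading matrix
## Latent factors: three independent AR(1) processes
x1 <- arima.sim(model=list(ar=c(0.6)), n=n)
x2 <- arima.sim(model=list(ar=c(-0.5)), n=n)
x3 <- arima.sim(model=list(ar=c(0.3)), n=n)
eps <- matrix(rnorm(n*p), p, n)          # p x n white noise
X <- t(cbind(x1, x2, x3))                # r x n matrix of factors
Y <- A %*% X + eps                       # p x n
Y <- t(Y)                                # n x p, the layout Factors() expects

## Estimate the number of factors and the loading matrix
fac <- Factors(Y, lag.k=2)
r_hat <- fac$factor_num
loading_Mat <- fac$loading.mat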
