HDSReg: High dimensional stochastic regression with latent factors

Description

HDSReg() fits a multivariate time series model that represents a high-dimensional vector process as the sum of three terms: a linear regression on observed regressors, a linear combination of latent, serially correlated factors, and a vector white noise: $${\bf y}_t = {\bf Dz}_t + {\bf Ax}_t + {\boldsymbol {\epsilon}}_t,$$ where ${\bf y}_t$ and ${\bf z}_t$ are observable $p\times 1$ and $m \times 1$ time series, respectively; ${\bf x}_t$ is an $r \times 1$ latent factor process; ${\boldsymbol{\epsilon}}_t \sim \mathrm{WN}({\boldsymbol{0}},{\bf \Sigma}_{\epsilon})$ is a white noise with zero mean and covariance matrix ${\bf \Sigma}_{\epsilon}$ that is uncorrelated with $({\bf z}_t, {\bf x}_t)$; ${\bf D}$ is an unknown $p \times m$ regression coefficient matrix; and ${\bf A}$ is an unknown $p \times r$ factor loading matrix. The procedure, proposed in Chang, Guo and Yao (2015), estimates the regression coefficient matrix ${\bf D}$, the number of factors $r$, and the factor loading matrix ${\bf A}$.
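
A short derivation may help show why removing the regression term leaves a factor model. It is a sketch that assumes, beyond what is stated above, that ${\boldsymbol{\epsilon}}_t$ is uncorrelated with ${\bf x}_s$ at all lags; ${\boldsymbol{\eta}}_t$ and ${\bf \Sigma}_x(k)$ (the lag-$k$ autocovariance of ${\bf x}_t$) are notation introduced here for illustration:

$${\boldsymbol{\eta}}_t = {\bf y}_t - {\bf Dz}_t = {\bf Ax}_t + {\boldsymbol{\epsilon}}_t, \qquad {\bf \Sigma}_{\eta}(k) = \mathrm{Cov}({\boldsymbol{\eta}}_{t+k}, {\boldsymbol{\eta}}_t) = {\bf A}\,{\bf \Sigma}_x(k)\,{\bf A}' \quad (k \geq 1).$$

Under these assumptions the white noise contributes nothing to the autocovariances at nonzero lags, so the column space of ${\bf A}$ can be recovered from the lagged autocovariances of ${\boldsymbol{\eta}}_t$.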

Usage

HDSReg(Y, Z, D = NULL, lag.k = 1, twostep = FALSE)

Value

An object of class "factors", which is a list containing the following components:

factor_num: The estimated number of factors $\hat{r}$.
reg.coff.mat: The estimated $p \times m$ regression coefficient matrix $\widetilde{\bf D}$ if D is not given.
loading.mat: The estimated $p \times \hat{r}$ factor loading matrix ${\bf \widehat{A}}$.
lag.k: The time lag used in the function.
method: A character string indicating which method was performed.

Arguments

Y: ${\bf Y} = \{{\bf y}_1, \dots , {\bf y}_n \}'$, a data matrix with $n$ rows and $p$ columns, where $n$ is the sample size and $p$ is the dimension of ${\bf y}_t$.
Z: ${\bf Z} = \{{\bf z}_1, \dots , {\bf z}_n \}'$, a data matrix representing some observed regressors with $n$ rows and $m$ columns, where $n$ is the sample size and $m$ is the dimension of ${\bf z}_t$.
D: A $p\times m$ regression coefficient matrix $\widetilde{\bf D}$. If D = NULL (the default), the procedure first estimates ${\bf D}$ and sets $\widetilde{\bf D}$ to that estimate. If D is supplied by the user, then $\widetilde{\bf D}={\bf D}$.
lag.k: Time lag $k_0$ used to calculate the nonnegative definite matrix $\widehat{\mathbf{M}}$: $$\widehat{\mathbf{M}}\ =\ \sum_{k=1}^{k_0}\widehat{\mathbf{\Sigma}}_{\eta}(k)\widehat{\mathbf{\Sigma}}_{\eta}(k)',$$ where $\widehat{\bf \Sigma}_{\eta}(k)$ is the sample autocovariance of ${\boldsymbol {\eta}}_t = {\bf y}_t - \widetilde{\bf D}{\bf z}_t$ at lag $k$.
twostep: Logical. If FALSE (the default), the standard procedure (see Factors) is used to estimate $r$ and ${\bf A}$. If TRUE, the two-step estimation procedure (see Factors) is used instead.
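
The roles of D and lag.k can be illustrated in base R. The sketch below is not the package's internal code. It assumes that $\widetilde{\bf D}$ is obtained by ordinary least squares without an intercept, that the sample autocovariance is normalised by $1/n$, and that $\hat{r}$ is chosen by minimising the ratios of adjacent eigenvalues of $\widehat{\mathbf{M}}$ over the first $p/2$ candidates. All names (acov, D_tilde, M_hat, r_hat, A_hat) are hypothetical and chosen for illustration only.

```r
# Illustrative sketch in base R; not the package's internal code.
set.seed(1)
n <- 200; p <- 10; m <- 2; r <- 1; k0 <- 2

# Simulate y_t = D z_t + A x_t + eps_t with a single AR(1) latent factor.
Z <- matrix(rnorm(n * m), n, m)
X <- matrix(arima.sim(model = list(ar = 0.6), n = n), n, r)
D <- matrix(runif(p * m, -2, 2), p, m)
A <- matrix(runif(p * r, -2, 2), p, r)
Y <- Z %*% t(D) + X %*% t(A) + matrix(rnorm(n * p), n, p)

# Assumed step 1: least squares estimate of D (no intercept), p x m.
D_tilde <- t(solve(crossprod(Z), crossprod(Z, Y)))

# Residual process eta_t = y_t - D_tilde z_t, centred column-wise.
eta <- scale(Y - Z %*% t(D_tilde), center = TRUE, scale = FALSE)

# Hypothetical helper: lag-k sample autocovariance with 1/n normalisation.
acov <- function(E, k) {
  n_obs <- nrow(E)
  crossprod(E[(k + 1):n_obs, , drop = FALSE],
            E[1:(n_obs - k), , drop = FALSE]) / n_obs
}

# M_hat = sum over k = 1..k0 of Sigma_eta(k) Sigma_eta(k)'.
M_hat <- Reduce(`+`, lapply(seq_len(k0), function(k) tcrossprod(acov(eta, k))))

# Assumed step 2: eigenanalysis of M_hat and an eigenvalue-ratio choice of r.
ev <- eigen(M_hat, symmetric = TRUE)
R_max <- floor(p / 2)
r_hat <- which.min(ev$values[2:(R_max + 1)] / ev$values[1:R_max])
A_hat <- ev$vectors[, seq_len(r_hat), drop = FALSE]
```

Changing k0 changes how many lagged autocovariances contribute to M_hat, which is the role lag.k plays in the argument description above. The columns of A_hat are orthonormal eigenvectors, so they estimate the loading space rather than the particular matrix A used in the simulation.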

References

Chang, J., Guo, B. & Yao, Q. (2015). High dimensional stochastic regression with latent factors, endogeneity and nonlinearity, Journal of Econometrics, Vol. 189, pp. 297–312.

Examples

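# Assumes HDSReg() is already available, e.g. after library(HDTSA)
n <- 400  # sample size
p <- 200  # dimension of y_t
m <- 2    # dimension of z_t
r <- 3    # number of latent factors

# Latent factors: three independent AR(1) processes, stored as an r x n matrix
x1 <- arima.sim(model = list(ar = 0.6), n = n)
x2 <- arima.sim(model = list(ar = -0.5), n = n)
x3 <- arima.sim(model = list(ar = 0.3), n = n)
X <- t(cbind(x1, x2, x3))

# Observed regressors: a VAR(1) process, stored as an m x n matrix
Z <- mat.or.vec(m, n)
S1 <- matrix(c(5/8, 1/8, 1/8, 5/8), 2, 2)
Z[, 1] <- rnorm(m)
for (i in 2:n) {
  Z[, i] <- S1 %*% Z[, i - 1] + rnorm(m)
}

# True coefficient matrices and white noise
D <- matrix(runif(p * m, -2, 2), ncol = m)
A <- matrix(runif(p * r, -2, 2), ncol = r)
eps <- matrix(rnorm(n * p), p, n)

# Generate y_t = D z_t + A x_t + eps_t, then transpose to n x p and n x m
Y <- t(D %*% Z + A %*% X + eps)
Z <- t(Z)

res1 <- HDSReg(Y, Z, D, lag.k = 2)  # D supplied by the user
res2 <- HDSReg(Y, Z, lag.k = 2)     # D estimated from the data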