
HDMFA (version 0.1.1)

KMHFA: Estimating the Pair of Factor Numbers via Eigenvalue Ratios or Rank Minimization.

Description

The function estimates the pair of factor numbers either via eigenvalue ratios corresponding to the RMFA method, or via rank minimization and eigenvalue ratios corresponding to Iterative Huber Regression (IHR).

Usage

KMHFA(X, W1 = NULL, W2 = NULL, kmax, method, max_iter = 100, c = 1e-04, ep = 1e-04)

Value

\(k_1\)

The estimated row factor number.

\(k_2\)

The estimated column factor number.

Arguments

X

Input an array with dimensions \(T \times p_1 \times p_2\), where \(T\) is the sample size, \(p_1\) is the row dimension of each matrix observation, and \(p_2\) is the column dimension of each matrix observation.

W1

Only if method="E_RM" or method="E_ER", the initial value of the row loadings matrix. The default is NULL, in which case it is randomly generated with all entries drawn from a standard normal distribution.

W2

Only if method="E_RM" or method="E_ER", the initial value of the column loadings matrix. The default is NULL, in which case it is randomly generated with all entries drawn from a standard normal distribution.

kmax

The user-supplied maximum factor number, i.e., an upper bound on both the number of row factors and the number of column factors.

method

Character string, specifying the type of the estimation method to be used.

"P",

the robust iterative eigenvalue-ratio based on RMFA

"E_RM",

the rank-minimization based on IHR

"E_ER",

the eigenvalue-ratio based on IHR

max_iter

Only if method="E_RM" or method="E_ER", the maximum number of iterations in the iterative Huber regression algorithm. The default is 100.

c

A constant to avoid vanishing denominators. The default is \(10^{-4}\).

ep

Only if method="E_RM" or method="E_ER", the stopping criterion parameter in the iterative Huber regression algorithm. The default is \(10^{-4} \times Tp_1 p_2\).

Author

Yong He, Changwei Zhao, Ran Zhao.

Details

If method="P", the numbers of factors \(k_1\) and \(k_2\) are estimated by $$\hat{k}_1 = \arg \max_{j \leq k_{max}} \frac{\lambda _j (\bold{M}_c^w)}{\lambda _{j+1} (\bold{M}_c^w)}, \hat{k}_2 = \arg \max_{j \leq k_{max}} \frac{\lambda _j (\bold{M}_r^w)}{\lambda _{j+1} (\bold{M}_r^w)},$$ where \(k_{max}\) is a predetermined value larger than \(k_1\) and \(k_2\), and \(\lambda _j(\cdot)\) is the j-th largest eigenvalue of a nonnegative definite matrix. See the function MHFA for the definition of \(\bold{M}_c^w\) and \(\bold{M}_r^w\). For details, see He et al. (2023).
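As a small illustration (a sketch, not the package's internal code), the eigenvalue-ratio step amounts to picking the index that maximizes consecutive eigenvalue ratios. Here `evals` is a hypothetical vector of eigenvalues sorted in decreasing order; in the actual method these would be the eigenvalues of \(\bold{M}_c^w\) or \(\bold{M}_r^w\):

```r
# Eigenvalue-ratio estimator: choose j <= kmax maximizing lambda_j / lambda_{j+1}.
# `evals` is an assumed vector of eigenvalues, sorted in decreasing order.
er_estimate <- function(evals, kmax) {
  ratios <- evals[1:kmax] / evals[2:(kmax + 1)]
  which.max(ratios)
}

# Toy check: a clear gap after the 3rd eigenvalue yields k-hat = 3.
evals <- c(10, 8, 6, 0.5, 0.4, 0.3, 0.2)
er_estimate(evals, kmax = 5)  # returns 3
```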

Define \(D=\min({\sqrt{Tp_1}},\sqrt{Tp_2},\sqrt{p_1 p_2})\), $$\hat{\bold{\Sigma}}_1=\frac{1}{T}\sum_{t=1}^T\hat{\bold{F}}_t \hat{\bold{F}}_t^\top, \hat{\bold{\Sigma}}_2=\frac{1}{T}\sum_{t=1}^T\hat{\bold{F}}_t^\top \hat{\bold{F}}_t,$$ where \(\hat{\bold{F}}_t, t=1, \dots, T\) are estimated by IHR with the number of factors set to \(k_{max}\).

If method="E_RM", the numbers of factors \(k_1\) and \(k_2\) are estimated by $$\hat{k}_1=\sum_{i=1}^{k_{max}}I\left(\mathrm{diag}(\hat{\bold{\Sigma}}_1)_i>P_1\right), \hat{k}_2=\sum_{j=1}^{k_{max}}I\left(\mathrm{diag}(\hat{\bold{\Sigma}}_2)_j > P_2\right),$$ where \(I\) is the indicator function and \(\mathrm{diag}(\cdot)_i\) denotes the \(i\)-th diagonal entry. In practice, \(P_1\) is set as \(\max \left(\mathrm{diag}(\hat{\bold{\Sigma}}_1)\right) \cdot D^{-2/3}\), \(P_2\) is set as \(\max \left(\mathrm{diag}(\hat{\bold{\Sigma}}_2)\right) \cdot D^{-2/3}\).
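The rank-minimization step can be sketched as thresholding the diagonal of the estimated factor covariance (again an illustration, not the package's code; `Sigma1` stands in for \(\hat{\bold{\Sigma}}_1\) obtained from IHR):

```r
# Rank-minimization estimator: count diagonal entries of Sigma-hat above the
# data-driven threshold P = max(diag) * D^(-2/3).
# `Sigma1` is an assumed kmax x kmax matrix; D is as defined in the text.
rm_estimate <- function(Sigma1, D, kmax) {
  d <- diag(Sigma1)[1:kmax]
  P <- max(d) * D^(-2 / 3)
  sum(d > P)
}

# Toy check: with T = p1 = p2 = 20, D = min(sqrt(400), sqrt(400), sqrt(400)) = 20.
Sigma1 <- diag(c(5, 4, 3, 0.01, 0.01, 0.01))
rm_estimate(Sigma1, D = 20, kmax = 6)  # returns 3
```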

If method="E_ER", the number of factors \(k_1\) and \(k_2\) are estimated by $$\hat{k}_1 = \arg \max_{i \leq k_{max}} \frac{\lambda _i (\hat{\bold{\Sigma}}_1)}{\lambda _{i+1} (\hat{\bold{\Sigma}}_1)+cD^{-2}}, \hat{k}_2 = \arg \max_{j \leq k_{max}} \frac{\lambda _j (\hat{\bold{\Sigma}}_2)}{\lambda _{j+1} (\hat{\bold{\Sigma}}_2)+cD^{-2}}.$$
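This variant differs from the plain eigenvalue ratio only through the penalty \(cD^{-2}\) in the denominator, which guards against vanishing eigenvalues beyond the true rank. A minimal sketch under the same assumptions as above (`evals` is a hypothetical decreasing eigenvalue vector, here of \(\hat{\bold{\Sigma}}_1\)):

```r
# Penalized eigenvalue-ratio estimator: c * D^(-2) stabilizes the denominator
# when trailing eigenvalues are (near) zero.
er_pen_estimate <- function(evals, kmax, D, c = 1e-4) {
  ratios <- evals[1:kmax] / (evals[2:(kmax + 1)] + c * D^(-2))
  which.max(ratios)
}

# Toy check: trailing eigenvalues are essentially zero, so the plain ratio
# would divide by ~0; the penalty keeps the criterion well defined.
evals <- c(10, 8, 6, 1e-12, 0, 0, 0)
er_pen_estimate(evals, kmax = 5, D = 20)  # returns 3
```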

References

He, Y., Kong, X., Yu, L., Zhang, X., & Zhao, C. (2023). Matrix factor analysis: From least squares to iterative projection. Journal of Business & Economic Statistics, 1-26.

He, Y., Kong, X. B., Liu, D., & Zhao, R. (2023). Robust Statistical Inference for Large-dimensional Matrix-valued Time Series via Iterative Huber Regression. <arXiv:2306.03317>.

Examples

set.seed(11111)

# Simulate T matrix observations of size p1 x p2 with k1 row factors
# and k2 column factors
T <- 20; p1 <- 20; p2 <- 20; k1 <- 3; k2 <- 3
R <- matrix(runif(p1 * k1, min = -1, max = 1), p1, k1)  # row loadings
C <- matrix(runif(p2 * k2, min = -1, max = 1), p2, k2)  # column loadings
X <- array(0, c(T, p1, p2))
Y <- X; E <- Y
F <- array(0, c(T, k1, k2))
for (t in 1:T) {
  F[t, , ] <- matrix(rnorm(k1 * k2), k1, k2)
  E[t, , ] <- matrix(rnorm(p1 * p2), p1, p2)
  Y[t, , ] <- R %*% F[t, , ] %*% t(C)
}
X <- Y + E

KMHFA(X, kmax = 6, method = "P")
# \donttest{
KMHFA(X, W1 = NULL, W2 = NULL, 6, "E_RM")
KMHFA(X, W1 = NULL, W2 = NULL, 6, "E_ER")
# }
