Description
This is a simulated data set whose data matrix $Y$ is generated from the latent
factor model
$$Y = n^{1/2}U D V' + E \Sigma^{1/2}$$
where $D$ and $\Sigma$ are diagonal matrices, and $U$ and $V$
are orthogonal; $V'$ denotes the transpose of $V$. The model includes eight factors:
one giant factor, five useful factors, one harmful factor and one undetectable factor.
For more details of the simulation method
used, please refer to Appendix A.1 of Owen and Wang (2015), Bi-cross-validation for factor analysis, http://arxiv.org/abs/1503.03515.
Details
The dataset is a list of components:
Y
a data matrix of 200 by 1000, where each row is a sample and each column
is a variable.
U
the orthogonal factor matrix $U$ of size 200 by 8.
V
the orthogonal factor matrix $V$ of size 1000 by 8.
D
the vector of diagonal entries of $D$.
Sigma
the vector of diagonal entries of $\Sigma$.
oracle.r
the oracle rank (the optimal number of factors that should be kept)
of the factor matrix.
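Examples
The following is a minimal sketch, in base R only, of how a list with the same layout could be simulated from the model in the Description. It is not the code used to produce this data set. Several things are assumptions made for illustration: $n$ is taken to be the number of samples (rows of $Y$), the entries of $E$ are drawn as independent standard normals, and the values chosen for `D` and `Sigma` are placeholders rather than the settings of Appendix A.1. The object name `sim` is hypothetical, and `oracle.r` is left out because it cannot be derived from the model alone.

```r
set.seed(1)                      # reproducible illustration

n <- 200                         # samples (rows of Y); assumed meaning of n
p <- 1000                        # variables (columns of Y)
k <- 8                           # 1 giant + 5 useful + 1 harmful + 1 undetectable

# Random matrices with orthonormal columns, obtained from a QR decomposition.
U <- qr.Q(qr(matrix(rnorm(n * k), n, k)))   # n by k
V <- qr.Q(qr(matrix(rnorm(p * k), p, k)))   # p by k

# Placeholder values, NOT those of the actual data set.
D     <- seq(from = 4, to = 0.5, length.out = k)  # diagonal entries of D
Sigma <- runif(p, min = 0.5, max = 1.5)           # diagonal entries of Sigma

# Noise matrix; independent standard normal entries are an assumption.
E <- matrix(rnorm(n * p), n, p)

# Y = n^{1/2} U D V' + E Sigma^{1/2}
signal <- sqrt(n) * U %*% diag(D) %*% t(V)
noise  <- sweep(E, 2, sqrt(Sigma), `*`)     # scales column j by sqrt(Sigma[j])
Y <- signal + noise

# Same component names as documented above (oracle.r omitted).
sim <- list(Y = Y, U = U, V = V, D = D, Sigma = Sigma)
```

With this sketch, `dim(sim$Y)` gives 200 and 1000, and `crossprod(sim$U)` is the 8 by 8 identity matrix up to rounding error. Storing `D` and `Sigma` as vectors of diagonal entries, as the data set does, avoids forming the 1000 by 1000 matrix $\Sigma$ explicitly.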