esaBcv (version 1.2.1)

ESA: Estimate Latent Factor Matrix With Known Number of Factors

Description

Estimate the latent factor matrix and noise variance using early stopping alternation (ESA) given the number of factors.

Usage

ESA(Y, r, X = NULL, center = F, niter = 3, svd.method = "fast")

Arguments

Y
the observed data matrix of dimension c(n, p), where n is the sample size and p is the number of variables
r
The number of factors to use
X
the known predictors, a matrix of size c(n, k), where k is the number of known covariates. Default is NULL (no known predictors).
center
logical, whether to add an intercept term to the model. Default is FALSE.
niter
the number of iterations for ESA. Default is 3.
svd.method
either "fast", "propack" or "standard". "fast" is using the fast.svd function in package corpcor to compute SVD, "propack" is using the propack.svd

Value

The returned value is a list with components:
  • estSigma: the diagonal entries of the estimated $\Sigma$, a vector of length p
  • estU: the estimated $U$. Dimension is c(n, r)
  • estD: the estimated diagonal entries of $D$, a vector of length r
  • estV: the estimated $V$. Dimension is c(p, r)
  • beta: the estimated $\beta$, a matrix of size c(k, p). Returns NULL if the argument X is NULL.
  • estS: the estimated signal (factor) matrix $S$, where $$S = 1 \mu' + X \beta + n^{1/2} U D V'$$
  • mu: the sample center of each variable, a vector of length p; an estimate of $\mu$. Returns NULL if the argument center is FALSE.
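
For instance, the returned components can be inspected as follows (a minimal sketch, assuming the esaBcv package is installed; the toy data are illustrative):

library(esaBcv)
set.seed(1)
Y <- matrix(rnorm(50 * 100), 50, 100)   # toy data with n = 50, p = 100
res <- ESA(Y, r = 2)
length(res$estSigma)   # p = 100 estimated noise variances
dim(res$estU)          # c(n, r) = c(50, 2)
res$estD               # r = 2 estimated diagonal entries of D
dim(res$estV)          # c(p, r) = c(100, 2)
dim(res$estS)          # c(n, p); res$beta and res$mu are NULL here (no X, center = FALSE)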

Details

The model used is $$Y = 1 \mu' + X \beta + n^{1/2} U D V' + E \Sigma^{1/2}$$ where $D$ and $\Sigma$ are diagonal matrices, $U$ and $V$ are orthogonal, and $\mu'$ and $V'$ denote $\mu$ transposed and $V$ transposed, respectively. The entries of $E$ are assumed to be i.i.d. standard Gaussian. The model allows heteroscedastic noise and works especially well for high-dimensional data. The method is based on Owen and Wang (2015). Notice that when a nonnull X is given or centering the data is required (which is essentially adding a known covariate of all $1$s), for identifiability it is required that $\langle X, U \rangle = 0$ or $\langle 1, U \rangle = 0$, respectively. The method then first rotates the data matrix to remove the known predictors or centers, and uses the remaining n - k (or n - k - 1 if centering is required) samples to estimate the latent factors.
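
A small simulation sketch of this model, assuming the esaBcv package is installed (the generating quantities U, D, V and Sigma below are made up for illustration): data are drawn from $Y = n^{1/2} U D V' + E \Sigma^{1/2}$ with heteroscedastic noise, and ESA is then asked to recover $\Sigma$ given the true number of factors.

library(esaBcv)
set.seed(123)
n <- 100; p <- 200; r <- 3
U <- qr.Q(qr(matrix(rnorm(n * r), n, r)))   # orthonormal factor matrix
V <- qr.Q(qr(matrix(rnorm(p * r), p, r)))   # orthonormal loading matrix
D <- diag(c(3, 2, 1))                       # factor strengths
Sigma <- runif(p, 0.5, 2)                   # heteroscedastic noise variances
E <- matrix(rnorm(n * p), n, p)             # i.i.d. standard Gaussian noise
Y <- sqrt(n) * U %*% D %*% t(V) + E %*% diag(sqrt(Sigma))
fit <- ESA(Y, r)
plot(Sigma, fit$estSigma)                   # estimated vs. true noise variances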

References

Art B. Owen and Jingshu Wang (2015), Bi-cross-validation for factor analysis, http://arxiv.org/abs/1503.03515

Examples

# Simulate a 10 x 10 data matrix: i.i.d. noise plus a rank-one signal
Y <- matrix(rnorm(100), nrow = 10) + 3 * rnorm(10) %*% t(rep(1, 10))
# Estimate one latent factor and the noise variances
ESA(Y, 1)
