whiten
transforms a multivariate K-dimensional signal \(\mathbf{X}\) with mean
\(\boldsymbol \mu_X\) and covariance matrix \(\Sigma_{X}\) to a whitened
signal \(\mathbf{U}\) with mean \(\boldsymbol 0\) and \(\Sigma_U = I_K\).
Thus it centers the signal and makes it contemporaneously uncorrelated.
See Details.
check_whitened
checks if data has been whitened; i.e., if it has
zero mean, unit variance, and is uncorrelated.
sqrt_matrix
computes the square root \(\mathbf{B}\) of a square matrix
\(\mathbf{A}\). The matrix \(\mathbf{B}\) satisfies
\(\mathbf{B} \mathbf{B} = \mathbf{A}\).
whiten(data)
check_whitened(data, check.attribute.only = TRUE)
sqrt_matrix(mat, return.sqrt.only = TRUE, symmetric = FALSE)
data
\(n \times K\) array representing \(n\) observations of \(K\) variables.
check.attribute.only
logical; if TRUE it checks the attribute only. This is much faster (it only
needs to look up one attribute value), but it may not surface silent bugs.
For performance, the package uses the attribute version by default; for
testing and debugging, the full computational version can be used.
mat
a square \(K \times K\) matrix.
return.sqrt.only
logical; if TRUE (default) it returns only the square root matrix;
if FALSE it also returns auxiliary results (eigenvectors,
eigenvalues, and the inverse of the square root matrix).
symmetric
logical; if TRUE the eigen-solver assumes that the matrix is symmetric
(which makes it much faster). This is particularly useful for a covariance
matrix (as used in whiten). Default: FALSE.
whiten
returns a list with the whitened data, the transformation,
and other useful quantities.
check_whitened
throws an error if the input is not whitened, and returns (invisibly) the
data with an attribute 'whitened' equal to TRUE. This makes it possible to
run the full check on the actual data once (slow), record the result in the
attribute, and rely on the attribute lookup afterwards (fast).
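The check-once, look-up-later pattern described above can be sketched in base R. This is only an illustration, not the package's implementation: the helper name `check_whitened_sketch`, the tolerance `tol`, and the choice to raise an error when the attribute is missing are all assumptions.

```r
# Hypothetical helper (not the package's check_whitened): base-R sketch only.
check_whitened_sketch <- function(data, check.attribute.only = TRUE,
                                  tol = 1e-8) {
  if (check.attribute.only) {
    # Fast path: trust the attribute set by an earlier full check.
    if (!isTRUE(attr(data, "whitened"))) {
      stop("Data is not marked as whitened.")
    }
  } else {
    # Slow path: verify zero mean and identity covariance numerically.
    K <- ncol(data)
    ok <- max(abs(colMeans(data))) < tol &&
      max(abs(cov(data) - diag(K))) < tol
    if (!ok) stop("Data is not whitened.")
    attr(data, "whitened") <- TRUE
  }
  invisible(data)
}

set.seed(42)
X <- matrix(rnorm(300), ncol = 3)
# Whiten by hand with the inverse square root of the sample covariance.
Xc <- sweep(X, 2, colMeans(X))
eig <- eigen(cov(X), symmetric = TRUE)
U <- Xc %*% eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)

U <- check_whitened_sketch(U, check.attribute.only = FALSE)  # full check once
check_whitened_sketch(U)                                     # attribute lookup
```

Reassigning the result of the full check is what stores the attribute on the data, so later calls can use the cheap lookup.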
sqrt_matrix
returns an \(n \times n\) matrix. If \(\mathbf{A}\)
is not positive semidefinite, it returns a complex-valued \(\mathbf{B}\)
(since the square roots of negative eigenvalues are complex).
If return.sqrt.only = FALSE then it returns a list with:
eigenvalues of \(\mathbf{A}\),
eigenvectors of \(\mathbf{A}\),
square root matrix \(\mathbf{B}\),
inverse of \(\mathbf{B}\).
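To see why an indefinite matrix leads to a complex-valued square root, the following base-R sketch builds the root by hand for a symmetric matrix with one negative eigenvalue. It does not call the package; the example matrix and the use of `as.complex()` are assumptions made for illustration.

```r
# Symmetric but indefinite: eigenvalues are 3 and -1.
A <- matrix(c(1, 2, 2, 1), ncol = 2)
eig <- eigen(A, symmetric = TRUE)

# as.complex() lets sqrt() return an imaginary number for the negative
# eigenvalue instead of NaN.
root_values <- sqrt(as.complex(eig$values))
B <- eig$vectors %*% diag(root_values) %*% t(eig$vectors)

is.complex(B)            # the square root has complex entries
max(Mod(B %*% B - A))    # numerically zero, so B B = A still holds
```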
whiten
uses zero-phase component analysis (ZCA) (aka zero-phase whitening filters)
to whiten the data; i.e., it uses the
inverse square root of the covariance matrix of \(\mathbf{X}\) (see
sqrt_matrix) as the whitening transformation.
This means that on top of PCA, the uncorrelated principal components are
back-transformed to the original space using the
transpose of the eigenvectors. The advantage is that this makes them comparable
to the original \(\mathbf{X}\). See References for details.
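The ZCA transformation described above can be written out in a few lines of base R. This is a minimal sketch under the assumption that the sample covariance matrix is positive definite; it is not the package's `whiten` code, and the variable names are chosen for illustration.

```r
set.seed(1)
# Correlated data: independent Gaussian columns times a fixed mixing matrix.
X <- matrix(rnorm(200), ncol = 2) %*% matrix(c(2, 1, 0.5, 1), ncol = 2)

# Step 1: center each column.
Xc <- sweep(X, 2, colMeans(X))

# Step 2: eigen-decomposition of the covariance matrix.
eig <- eigen(cov(X), symmetric = TRUE)

# Step 3: inverse square root of the covariance, V Lambda^(-1/2) V'.
W <- eig$vectors %*% diag(1 / sqrt(eig$values)) %*% t(eig$vectors)

# Step 4: apply the whitening transformation to the centered data.
U <- Xc %*% W

round(colMeans(U), 10)  # column means of the whitened data
round(cov(U), 10)       # covariance of the whitened data
```

Dropping the final multiplication by `t(eig$vectors)` in step 3 would give PCA whitening instead; the extra rotation is what keeps the whitened columns aligned with the original variables.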
The square root of a square \(n \times n\) matrix \(\mathbf{A}\) can be computed from its eigen-decomposition. For a symmetric \(\mathbf{A}\) (e.g., a covariance matrix) this is $$ \mathbf{A} = \mathbf{V} \Lambda \mathbf{V}', $$ where the columns of \(\mathbf{V}\) are the eigenvectors and \(\Lambda\) is an \(n \times n\) diagonal matrix with the eigenvalues \(\lambda_1, \ldots, \lambda_n\) on the diagonal. The square root is simply \(\mathbf{B} = \mathbf{V} \Lambda^{1/2} \mathbf{V}'\) where \(\Lambda^{1/2} = diag(\lambda_1^{1/2}, \ldots, \lambda_n^{1/2})\).
Similarly, the inverse square root is defined as \(\mathbf{A}^{-1/2} = \mathbf{V} \Lambda^{-1/2} \mathbf{V}'\), where \(\Lambda^{-1/2} = diag(\lambda_1^{-1/2}, \ldots, \lambda_n^{-1/2})\) (provided that \(\lambda_i \neq 0\)).
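The two formulas above translate directly into base R. The sketch below assumes a symmetric, positive definite input; the function name `sqrt_matrix_sketch` and the names of the returned list elements are hypothetical and need not match what the package's `sqrt_matrix` returns.

```r
# Hypothetical helper (not the package's sqrt_matrix): base-R sketch only.
sqrt_matrix_sketch <- function(A) {
  stopifnot(is.matrix(A), nrow(A) == ncol(A))
  eig <- eigen(A, symmetric = TRUE)       # A = V diag(lambda) V'
  V <- eig$vectors
  lambda <- eig$values
  n <- length(lambda)
  # B = V Lambda^(1/2) V' and A^(-1/2) = V Lambda^(-1/2) V'
  B <- V %*% diag(sqrt(lambda), nrow = n) %*% t(V)
  B_inv <- V %*% diag(1 / sqrt(lambda), nrow = n) %*% t(V)
  list(values = lambda, vectors = V, sqrt = B, sqrt.inverse = B_inv)
}

A <- matrix(c(4, 1, 1, 3), ncol = 2)      # symmetric, positive definite
res <- sqrt_matrix_sketch(A)

all.equal(res$sqrt %*% res$sqrt, A)                  # checks B B = A
all.equal(res$sqrt %*% res$sqrt.inverse, diag(2))    # checks B B^(-1) = I
```

Passing `nrow = n` to `diag()` keeps the \(1 \times 1\) case correct, since `diag()` called with a single number would otherwise build an identity matrix of that size.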
See appendix in http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf.
See http://ufldl.stanford.edu/wiki/index.php/Implementing_PCA/Whitening.
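# Mix two independent Gaussian columns to obtain correlated data
XX <- matrix(rnorm(100), ncol = 2) %*% matrix(runif(4), ncol = 2)
cov(XX)  # covariance of the original (correlated) data
# The whitened signal is the element U of the returned list
UU <- whiten(XX)$U
cov(UU)  # should be (numerically) the identity matrix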