Randomized accelerated implementation of SPCA, using variable projection as an optimization strategy.
rspca(X, k = NULL, alpha = 1e-04, beta = 1e-04, center = TRUE,
scale = FALSE, max_iter = 1000, tol = 1e-05, o = 20, q = 2,
verbose = TRUE)
X: array_like; a real \((n, p)\) input matrix (or data frame) to be decomposed.
k: integer; specifies the target rank, i.e., the number of components to be computed.
alpha: float; sparsity-controlling parameter. Higher values lead to sparser components.
beta: float; amount of ridge shrinkage to apply in order to improve conditioning.
center: bool; logical value indicating whether the variables should be shifted to be zero centered (TRUE by default).
scale: bool; logical value indicating whether the variables should be scaled to have unit variance (FALSE by default).
max_iter: integer; maximum number of iterations to perform before exiting.
tol: float; stopping tolerance for the convergence criterion.
o: integer; oversampling parameter (default \(o = 20\)); together with \(q\), it controls the accuracy of the randomized approximation (see the sketch after this list).
q: integer; number of additional power iterations (default \(q = 2\)).
verbose: bool; logical value indicating whether progress is printed.
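The oversampling parameter \(o\) and the power iterations \(q\) trade approximation accuracy against speed: larger values bring the randomized approximation closer to the deterministic result at additional cost. A minimal sketch (the values shown are illustrative, not recommendations; X denotes a data matrix as in the Examples):

# Cheaper but rougher randomized approximation:
out_fast <- rspca(X, k = 3, o = 5, q = 0)
# Slower but more accurate approximation:
out_accurate <- rspca(X, k = 3, o = 50, q = 5)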
rspca returns a list containing the following components:
loadings: array_like; sparse loadings (weight) matrix; \((p, k)\) dimensional array.
transform: array_like; the approximated inverse transform; \((p, k)\) dimensional array.
scores: array_like; the principal component scores; \((n, k)\) dimensional array.
eigenvalues: array_like; the approximated eigenvalues; \((k)\) dimensional array.
center, scale: array_like; the centering and scaling used.
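The components can be extracted from the returned list in the usual way; a brief sketch (assuming the component names listed above):

out <- rspca(X, k = 3)    # X is an (n, p) data matrix
B <- out$loadings         # sparse loadings, (p, k)
A <- out$transform        # approximated inverse transform, (p, k)
Z <- out$scores           # principal component scores, (n, k)
lambda <- out$eigenvalues # approximated eigenvalues, length k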
Sparse principal component analysis is a modern variant of PCA. Specifically, SPCA attempts to find sparse weight vectors (loadings), i.e., a weight vector with only a few 'active' (nonzero) values. This approach leads to an improved interpretability of the model, because the principal components are formed as a linear combination of only a few of the original variables. Further, SPCA avoids overfitting in a high-dimensional data setting where the number of variables \(p\) is greater than the number of observations \(n\).
Such a parsimonious model is obtained by introducing prior information in the form of sparsity-promoting regularizers. More concretely, given an \((n,p)\) data matrix \(X\), SPCA attempts to minimize the following objective function:
$$ f(A,B) = \frac{1}{2} \| X - X B A^\top \|^2_F + \psi(B) $$
where \(B\) is the sparse weight (loadings) matrix and \(A\) is an orthonormal matrix. \(\psi\) denotes a sparsity inducing regularizer such as the LASSO (\(\ell_1\) norm) or the elastic net (a combination of the \(\ell_1\) and \(\ell_2\) norm). The principal components \(Z\) are formed as
$$ Z = X B $$
and the data can be approximately rotated back as
$$ \tilde{X} = Z A^\top $$
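Continuing the sketch above, the scores and the approximate reconstruction can be recovered directly from the returned components. This assumes the scores are computed on the centered data, which is an assumption of this sketch rather than a documented guarantee:

Xc <- scale(X, center = out$center, scale = FALSE)  # centered data
Z <- Xc %*% out$loadings                            # Z = X B
all.equal(Z, out$scores, check.attributes = FALSE)  # expected TRUE
X_tilde <- Z %*% t(out$transform)                   # X_tilde = Z A^T
mean((Xc - X_tilde)^2)                              # mean squared reconstruction error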
The print and summary methods can be used to present the results in a nice format.
[1] N. B. Erichson, P. Zheng, K. Manohar, S. Brunton, J. N. Kutz, A. Y. Aravkin. "Sparse Principal Component Analysis via Variable Projection." Submitted to IEEE Journal of Selected Topics on Signal Processing (2018). (Available at arXiv: https://arxiv.org/abs/1804.00341)
[2] N. B. Erichson, S. Voronin, S. Brunton, J. N. Kutz. "Randomized matrix decompositions using R." Submitted to Journal of Statistical Software (2016). (Available at arXiv: https://arxiv.org/abs/1608.02148)
# Create artificial data
m <- 10000
V1 <- rnorm(m, 0, 290)
V2 <- rnorm(m, 0, 300)
V3 <- -0.1*V1 + 0.1*V2 + rnorm(m,0,100)
X <- cbind(V1,V1,V1,V1, V2,V2,V2,V2, V3,V3)
X <- X + matrix(rnorm(length(X),0,1), ncol = ncol(X), nrow = nrow(X))
# Compute SPCA
out <- rspca(X, k = 3, alpha = 1e-3, beta = 1e-3, center = TRUE, scale = FALSE, verbose = FALSE)
print(out)
summary(out)
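# Illustrative addition (not part of the original example): inspect the
# sparsity of the loadings by counting zero entries per component.
colSums(out$loadings == 0)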