nscumcomp: Non-Negative and Sparse Cumulative PCA

Description

Performs a PCA-like analysis on the given data matrix, where non-negativity and/or sparsity constraints are enforced on the principal axes (PAs). In contrast to regular PCA, which greedily maximizes the variance of each principal component (PC), nscumcomp jointly optimizes the components such that the cumulative variance of all PCs is maximal.

Usage

nscumcomp(x, ...)
# S3 method for default
nscumcomp(x, ncomp = min(dim(x)), omega = rep(1, nrow(x)),
  k = d * ncomp, nneg = FALSE, gamma = 0, center = TRUE,
  scale. = FALSE, nrestart = 5, em_tol = 0.001, em_maxiter = 20,
  verbosity = 0, ...)
# S3 method for formula
nscumcomp(formula, data = NULL, subset, na.action, ...)

Arguments

x

a numeric matrix or data frame which provides the data for the analysis.

...

arguments passed to or from other methods.

ncomp

the number of principal components (PCs) to be computed. The default is to compute a full basis for x.

omega

a vector with as many entries as there are data samples, to perform weighted PCA (analogous to weighted least-squares regression). The default is an equal weighting of all samples.

k

an upper bound on the total number of non-zero loadings of the pseudo-rotation matrix $\mathbf{W}$. k is increased if necessary to ensure at least one non-zero coefficient per principal axis.

nneg

a logical value indicating whether the loadings should be non-negative, i.e. the PAs should be constrained to the non-negative orthant.

gamma

a non-negative penalty on the divergence from orthonormality of the pseudo-rotation matrix. The default is not to penalize, but a positive value is sometimes necessary to avoid PAs collapsing onto each other.

center

a logical value indicating whether the empirical mean of (the columns of) x should be subtracted. Alternatively, a vector of length equal to the number of columns of x can be supplied. The value is passed to scale.

scale.

a logical value indicating whether the columns of x should be scaled to have unit variance before the analysis takes place. The default is FALSE for consistency with prcomp. Alternatively, a vector of length equal to the number of columns of x can be supplied. The value is passed to scale.

nrestart

the number of random restarts for computing the pseudo-rotation matrix via expectation-maximization (EM) iterations. The solution achieving the minimum of the objective function over all random restarts is kept. A value greater than one can help to avoid poor local minima.

em_tol

If the relative change of the objective is less than em_tol between iterations, the EM procedure is assumed to have converged to a local optimum.

em_maxiter

the maximum number of EM iterations to be performed. The EM procedure is terminated if either the em_tol or the em_maxiter criterion is satisfied.

verbosity

an integer specifying the verbosity level. Greater values result in more output; the default is to be quiet.

formula

a formula with no response variable, referring only to numeric variables.

data

an optional data frame (or similar: see model.frame) containing the variables in the formula formula. By default the variables are taken from environment(formula).

subset

an optional vector used to select rows (observations) of the data matrix x.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset.

Value

nscumcomp returns a list with class (nsprcomp, prcomp) containing the following elements:

sdev

the additional standard deviation explained by each component, see asdev.

rotation

the matrix of non-negative and/or sparse loadings, containing the principal axes as columns.

x

the scores matrix $\mathbf{XW}$ containing the principal components as columns (after centering and scaling if requested)

center, scale

the centering and scaling used, or FALSE

xp

the deflated data matrix corresponding to x

q

an orthonormal basis for the principal subspace

The components are returned in order of decreasing variance for convenience.
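
To make the notion of additional standard deviation concrete, the following base-R sketch deflates the data by each axis in turn before measuring the next one, so that variance along overlapping (non-orthogonal) axes is not counted twice. The helper name additional_sdev, the QR decomposition used to obtain the orthonormal basis, and the iris data are assumptions made for illustration. This is not the package's asdev implementation, only one plausible reading of what it measures.

```r
# Hypothetical helper, not part of nsprcomp: sequentially deflated
# standard deviations for a (possibly non-orthogonal) loading matrix W.
additional_sdev <- function(X, W) {
  Q <- qr.Q(qr(W))            # orthonormal basis for the span of W
  Xp <- X                     # running deflated copy of the data
  out <- numeric(ncol(W))
  for (j in seq_len(ncol(W))) {
    # standard deviation of the scores not yet explained by earlier axes
    out[j] <- sd(as.vector(Xp %*% W[, j]))
    # remove the direction of the j-th basis vector from the data
    Xp <- Xp - Xp %*% tcrossprod(Q[, j, drop = FALSE])
  }
  out
}

# Centered example data (assumption: iris stands in for real data)
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
pca <- prcomp(X, center = FALSE)

# Two orthonormal axes, and a pair where the second axis overlaps the first
W_orth <- pca$rotation[, 1:2]
W_skew <- cbind(W_orth[, 1], (W_orth[, 1] + W_orth[, 2]) / sqrt(2))

sd_orth <- additional_sdev(X, W_orth)
sd_skew <- additional_sdev(X, W_skew)
```

With orthonormal axes the result agrees with the sdev reported by prcomp. For the skewed pair the second value shrinks by a factor of sqrt(2), because only the part of that axis orthogonal to the first one contributes.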

Details

nscumcomp computes all PCs jointly using expectation-maximization (EM) iterations. The M-step is equivalent to minimizing the objective function

$$\left\Vert \mathbf{X}-\mathbf{Z}\mathbf{W}^{\top}\right\Vert _{F}^{2}+\gamma\left\Vert \mathbf{W}^{\top}\mathbf{W}-\mathbf{I}\right\Vert _{F}^{2}$$

with respect to the pseudo-rotation matrix $\mathbf{W}$, where $\mathbf{Z}=\mathbf{X}\mathbf{W}\left(\mathbf{W}^\top\mathbf{W}\right)^{-1}$ is the scores matrix modified to account for the non-orthogonality of $\mathbf{W}$, $\mathbf{I}$ is the identity matrix, and gamma ($\gamma$ in the formula) is the Lagrange parameter associated with the orthonormality penalty on $\mathbf{W}$. Non-negativity of the loadings is achieved by enforcing a zero lower bound in the L-BFGS-B algorithm used for the minimization of the objective, and sparsity is achieved by a subsequent soft thresholding of $\mathbf{W}$.
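
The objective can be evaluated directly in base R, which may help when judging how large gamma needs to be. The sketch below is illustrative only: the function names nscumcomp_objective and soft_threshold_k are hypothetical, the iris data is a stand-in, and the rule for choosing the threshold (keep at most k entries by shrinking with the (k+1)-th largest magnitude) is an assumption, not necessarily how the package selects it.

```r
# Hypothetical helper: value of the M-step objective for a given W.
nscumcomp_objective <- function(X, W, gamma = 0) {
  WtW <- crossprod(W)                        # W^T W
  Z <- X %*% W %*% solve(WtW)                # scores adjusted for non-orthogonality
  recon_err <- sum((X - Z %*% t(W))^2)       # squared Frobenius reconstruction error
  ortho_pen <- sum((WtW - diag(ncol(W)))^2)  # squared Frobenius orthonormality penalty
  recon_err + gamma * ortho_pen
}

# Hypothetical helper: soft thresholding that leaves at most k non-zero entries.
soft_threshold_k <- function(W, k) {
  a <- sort(as.vector(abs(W)), decreasing = TRUE)
  thr <- if (k < length(a)) a[k + 1] else 0  # assumed threshold rule
  sign(W) * pmax(abs(W) - thr, 0)
}

# Centered example data (assumption: iris stands in for real data)
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
pca <- prcomp(X, center = FALSE)

# Orthonormal axes incur no penalty; overlapping axes spanning the same
# subspace have the same reconstruction error but a positive penalty.
W_orth <- pca$rotation[, 1:2]
W_skew <- cbind(W_orth[, 1], (W_orth[, 1] + W_orth[, 2]) / sqrt(2))
obj_orth <- nscumcomp_objective(X, W_orth, gamma = 10)
obj_skew <- nscumcomp_objective(X, W_skew, gamma = 10)

# Shrink the full loading matrix so that at most 5 entries remain non-zero
W_sparse <- soft_threshold_k(pca$rotation, k = 5)
```

Because both loading matrices span the same subspace, the two objective values differ only by gamma times the penalty term, which shows how a positive gamma discourages axes from collapsing onto each other.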

Examples

# NOT RUN {
if (requireNamespace("MASS", quietly = TRUE)) withAutoprint({

  set.seed(1)

  # Regular PCA, with tolerance set to return five PCs
  pca <- prcomp(MASS::Boston, tol = 0.35, scale. = TRUE)
  cumsum(pca$sdev[1:5])

  # Sparse cumulative PCA with five components and a total of 20 non-zero loadings.
  # The orthonormality penalty is set to a value which avoids co-linear principal
  # axes. Note that the non-zero loadings are not distributed uniformly over
  # the components.
  scc <- nscumcomp(MASS::Boston, ncomp = 5, k = 20, gamma = 1e4, scale. = TRUE)
  cumsum(scc$sdev)
  cardinality(scc$rotation)

  # Non-negative sparse cumulative PCA
  nscumcomp(MASS::Boston, ncomp = 5, nneg = TRUE, k = 20, gamma = 1e4, scale. = TRUE)
})
# }
