ccipca: Candid Covariance-Free Incremental PCA

Description

Stochastic gradient ascent algorithm CCIPCA of Weng et al. (2003).

Usage

ccipca(lambda, U, x, n, q = length(lambda), l=2, center, tol = 1e-8, sort = TRUE)

Value

A list with components

values: updated eigenvalues.
vectors: updated eigenvectors.

Arguments

lambda: vector of eigenvalues.
U: matrix of eigenvectors (PC) stored in columns.
x: new data vector.
n: sample size before observing x.
q: number of eigenvectors to compute.
l: 'amnesic' parameter.
center: optional centering vector for x.
tol: numerical tolerance.
sort: Should the new eigenpairs be sorted?

Details

The 'amnesic' parameter l determines the weight of past observations in the PCA update. If l=0, all observations have equal weight, which is appropriate for stationary processes. Otherwise, typical values of l range between 2 and 4. As l increases, more weight is placed on new observations and less on older ones. For meaningful results, the condition 0<=l<n should hold.
The CCIPCA algorithm iteratively updates the PCs while deflating x. If at some point the Euclidean norm of x becomes less than tol, the algorithm stops to prevent numerical overflow.
If sort is TRUE, the updated eigenpairs are sorted by decreasing eigenvalue. If FALSE, they are not sorted.

References

Weng et al. (2003). Candid Covariance-free Incremental Principal Component Analysis. IEEE Trans. Pattern Analysis and Machine Intelligence.

Examples

Run this code

## Simulation of Brownian motion
n <- 100 # number of paths
d <- 50	 # number of observation points
q <- 10	 # number of PCs to compute
x <- matrix(rnorm(n*d,sd=1/sqrt(d)), n, d)
x <- t(apply(x,1,cumsum))	

## Initial PCA
n0 <- 50
pca <- princomp(x[1:n0,])
xbar <- pca$center
pca <- list(values=pca$sdev^2, vectors=pca$loadings)

## Incremental PCA
for (i in n0:(n-1))
{	xbar <- updateMean(xbar, x[i+1,], i)
  	pca <- ccipca(pca$values, pca$vectors, x[i+1,], i, q = q, center = xbar) }

# Uncentered PCA
nx1 <- sqrt(sum(x[1,]^2))
pca <- list(values=nx1^2, vectors=as.matrix(x[1,]/nx1))
for (i in n0:(n-1))
  	pca <- ccipca(pca$values, pca$vectors, x[i+1,], i, q = q)

Run the code above in your browser using DataLab