fastica: Fast Fixed Point ICA

Description

Fast fixed-point algorithm for independent component analysis (ICA) and projection pursuit, based on a direct translation to R of the FastICA program by the original authors at the Helsinki University of Technology.

Usage

fastica(X, approach = c("symmetric", "deflation"), n.comp = dim(X)[2], demean = TRUE, 
pca.cov = c("ML", "LW", "ROB", "EWMA"), gfun = c("pow3", "tanh", "gauss", "skew"), 
finetune = c("none", "pow3", "tanh", "gauss", "skew"), tanh.par = 1, gauss.par = 1, 
step.size = 1, stabilization = FALSE, epsilon = 1e-4, maxiter1 = 1000, maxiter2 = 5, 
A.init = NULL, pct.sample = 1, firstEig = NULL, lastEig = NULL, 
pcaE = NULL, pcaD = NULL, whiteSig = NULL, whiteMat = NULL, dewhiteMat = NULL, 
rseed = NULL, trace = FALSE, ...)

Arguments

X

The multidimensional signal matrix, where each column of the matrix represents one observed signal.

approach

The decorrelation approach to use: symmetric estimates the components in parallel, while deflation estimates them one by one, as in projection pursuit.

n.comp

Number of independent components to estimate. Defaults to dim(X)[2], the number of columns of X (one per observed signal). Overridden by firstEig and lastEig.

demean

(Logical) Whether the data should be centered.

pca.cov

The method to use for the calculation of the covariance matrix during the PCA whitening phase. ML is the standard maximum likelihood method, LW is the Ledoit and Wolf method, ROB is the robust method

gfun

The nonlinearity algorithm to use in the fixed-point algorithm.

finetune

The nonlinearity algorithm for fine-tuning.

tanh.par

Control parameter used when the nonlinearity algorithm is tanh.

gauss.par

Control parameter used when the nonlinearity algorithm is gauss.

step.size

Step size. If this is anything other than 1, the program will use the stabilized version of the algorithm.

stabilization

Controls whether the program uses the stabilized version of the algorithm. If the stabilization is on, then the value of step.size can momentarily be halved if the program estimates that the algorithm is stuck between two points (this i

epsilon

Stopping criterion. Default is 0.0001.

maxiter1

Maximum number of iterations for the gfun algorithm.

maxiter2

Maximum number of iterations for the finetune algorithm.

A.init

Initial guess for the mixing matrix A. Defaults to a matrix (no.signals by no.factors) filled with standard normal random values.

pct.sample

Proportion (between 0 and 1) of samples used in one iteration. Samples are chosen at random.

firstEig

Together with lastEig, specifies the range of eigenvalues that are retained. firstEig is the index of the largest eigenvalue to be retained. Using this option overrides n.comp.

lastEig

The index of the last (smallest) eigenvalue to be retained. Using this option overrides the n.comp argument.

pcaE

Optionally provided eigenvectors (must also supply pcaD).

pcaD

Optionally provided eigenvalues (must also supply pcaE).

whiteSig

Optionally provided whitened signal.

whiteMat

Optionally provided whitening matrix (no.factors by no.signals).

dewhiteMat

Optionally provided dewhitening matrix (no.signals by no.factors).

rseed

Optionally provided seed to initialize the mixing matrix A (when A.init not provided).

trace

To report progress in the console, set this to TRUE.

...

Optional arguments passed to the pca.cov methods.
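The whitening-related arguments (pcaE, pcaD, whiteSig, whiteMat, dewhiteMat) are easier to follow with their shapes in view. The following is a minimal base-R sketch of PCA whitening, not the package's internal code. The variable names (`E`, `D`, `white_mat`, `dewhite_mat`, `white_sig`) are hypothetical, and the use of `cov()` as a stand-in for the "ML" covariance estimate is an assumption.

```r
# Illustrative PCA whitening in base R (an assumption-laden sketch,
# not the code that fastica runs internally).
set.seed(1)
n_obs <- 2000
S <- matrix(runif(n_obs * 3), n_obs, 3)            # three independent sources
A <- matrix(c(1, 1, 0, -1, 2, 1, 0.5, 0, 1), 3, 3, byrow = TRUE)
X <- S %*% t(A)                                    # no.obs by no.signals

X_bar <- scale(X, center = TRUE, scale = FALSE)    # what demean = TRUE refers to
C <- cov(X_bar)                                    # sample covariance (stand-in for "ML")
eig <- eigen(C, symmetric = TRUE)                  # eigenvalues in decreasing order

n_factors <- 2                                     # keep the two largest eigenvalues
E <- eig$vectors[, 1:n_factors, drop = FALSE]      # plays the role of pcaE
D <- eig$values[1:n_factors]                       # plays the role of pcaD

white_mat   <- diag(1 / sqrt(D), n_factors) %*% t(E)  # no.factors by no.signals
dewhite_mat <- E %*% diag(sqrt(D), n_factors)         # no.signals by no.factors
white_sig   <- X_bar %*% t(white_mat)                 # no.obs by no.factors

# The whitened signal has (numerically) an identity covariance matrix.
round(cov(white_sig), 10)
```

Retaining only the leading eigenvalues, as firstEig and lastEig do, reduces the number of factors below the number of signals, which is why the whitening and dewhitening matrices are rectangular in general.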

Value

A list containing the following values:
A: Estimated mixing matrix (no.signals by no.factors).
W: Estimated unmixing matrix (no.factors by no.signals).
U: Estimated rotation matrix (no.factors by no.factors).
S: The column vectors of estimated independent components (no.obs by no.factors).
C: Estimated covariance matrix (no.signals by no.signals).
whiteningMatrix: The whitening matrix (no.factors by no.signals).
dewhiteningMatrix: The dewhitening matrix (no.signals by no.factors).
rseed: The random seed used (if any) for initializing the mixing matrix A.
elapsed: The elapsed time.

Details

The fastica program is a direct translation into R of the FastICA Matlab program of Gaevert, Hurri, Saerelae, and Hyvaerinen, with some extra features. All computations are currently implemented in R, so for very high-dimensional datasets alternative implementations may be faster. Part of the code may be ported to C++ in a future version.
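For readers who want to see the shape of the iteration, here is a minimal base-R sketch of a symmetric fixed-point update with the tanh nonlinearity, following the general scheme in the reference below. It is not the package's code: the helper names `sym_decorrelate` and `fixed_point_tanh` are hypothetical, the convergence test is one common choice rather than a documented one, and stabilization, fine-tuning, and subsampling are left out.

```r
# Symmetric decorrelation: W <- (W W')^(-1/2) W, so that the rows of W are orthonormal.
sym_decorrelate <- function(W) {
  e <- eigen(W %*% t(W), symmetric = TRUE)
  e$vectors %*% diag(1 / sqrt(e$values), nrow(W)) %*% t(e$vectors) %*% W
}

# Fixed-point iteration on whitened data Z (no.obs by no.factors).
# `a` is analogous to tanh.par, `epsilon` and `maxiter` to epsilon and maxiter1.
fixed_point_tanh <- function(Z, a = 1, epsilon = 1e-4, maxiter = 1000) {
  n <- nrow(Z)
  k <- ncol(Z)
  W <- sym_decorrelate(matrix(rnorm(k * k), k, k))   # random orthonormal start
  for (iter in seq_len(maxiter)) {
    Y  <- Z %*% t(W)                 # current component estimates
    G  <- tanh(a * Y)                # nonlinearity g
    Gp <- a * (1 - G^2)              # its derivative g'
    # Row i of the update: E{z g(w_i' z)} - E{g'(w_i' z)} w_i
    W_new <- sym_decorrelate(crossprod(G, Z) / n - diag(colMeans(Gp), k) %*% W)
    # Stop when each row has barely changed direction (assumed criterion).
    converged <- 1 - min(abs(diag(W_new %*% t(W)))) < epsilon
    W <- W_new
    if (converged) break
  }
  list(U = W, S = Z %*% t(W), iterations = iter)
}

# Usage on a two-signal mixture of uniform sources.
set.seed(100)
S_true <- matrix(runif(10000), 5000, 2)
A_true <- matrix(c(1, 1, -1, 2), 2, 2, byrow = TRUE)
X <- S_true %*% t(A_true)
X_bar <- scale(X, scale = FALSE)
e <- eigen(cov(X_bar), symmetric = TRUE)
Z <- X_bar %*% e$vectors %*% diag(1 / sqrt(e$values), 2)   # whitened data
fit <- fixed_point_tanh(Z)
abs(cor(fit$S, S_true))   # each row should have one entry close to 1
```

The recovered components match the sources only up to sign and ordering, which is why the check uses absolute correlations. The deflation approach would instead estimate one row of the rotation at a time and orthogonalize it against the rows already found.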

References

Hyvaerinen, A. and Oja, E., 1997, A fast fixed-point algorithm for independent component analysis, Neural Computation, 9(7), 1483-1492. Reprinted in Unsupervised Learning, G. Hinton and T. J. Sejnowski, 1999, MIT Press.

Examples

# create a set of independent signals S, glued together by a mixing matrix A
# (note the notation and matrix multiplication direction as we are dealing with
# row rather than column vectors)
set.seed(100)
S <- matrix(runif(10000), 5000, 2)
A <- matrix(c(1, 1, -1, 2), 2, 2, byrow = TRUE)
# the mixed signal X
X <- S %*% t(A)
# The function centers and whitens (by the eigenvalue decomposition of the
# unconditional covariance matrix) the data before applying the ICA algorithm.
IC <- fastica(X, n.comp = 2, approach = "symmetric", gfun = "tanh", trace = TRUE,
              A.init = diag(2))

# demeaned data:
X_bar <- scale(X, scale = FALSE)

# whitened data:
X_white <- X_bar %*% t(IC$whiteningMatrix)

# check whitening:
# check correlations are zero
cor(X_white)
# check diagonals are 1 in covariance
cov(X_white)

# check that the estimated signals (S) multiplied by the
# estimated mixing matrix (A) are the same as the original dataset (X)
round(head(IC$S %*% t(IC$A)), 12) == round(head(X), 12)

# do some plots:
par(mfrow = c(1, 3))
plot(IC$S %*% t(IC$A), main = "Pre-processed data")
plot(X_white, main = "Whitened and Centered components")
plot(IC$S, main = "ICA components")