Learn R Programming

hsicCCA (version 1.0)

hsicCCA: Canonical Correlation Analysis based on the Hilbert-Schmidt Independence Criterion.

Description

Given two multi-dimensional data sets, find pairs of canonical projection pairs that maximize the HSIC criterion.

Usage

hsicCCA(x, y, M, sigmax = NULL, sigmay = NULL, numrepeat = 5, numiter = 100, reltolstop = 1e-04)

Arguments

x
The x-variable data matrix. One row per observation.
y
The y-variable data matrix. One row per observation.
M
Number of canonical projection pairs to extract.
sigmax
The bandwidth parameter for the Gaussian kernel on the x-variable set. A positive value. The smaller the smoother. If NULL, set to median(dist(x)), and will be updated automatically for extracting different pairs of canonical projection.
sigmay
The bandwidth parameter for the Gaussian kernel on the y-variable set. A positive value. The smaller the smoother. If NULL, set to median(dist(y)), and will be updated automatically for extracting different pairs of canonical projection.
numrepeat
Number of random restarts.
numiter
Maximum number of iterations for extracting each pair of canonical projections.
reltolstop
Convergence threshold. Algorithm stops when relative change in cost from consecutive iterations is less than the threshold and will then move on to find the next pair of canonical vectors.

Value

  • A list containing:
  • WxThe M canoncial projection vectors for the x-variable set. Each column corresponds to a projection vector.
  • WyThe M canoncial projection vectors for the y-variable set. Each column corresponds to a projection vector.

Details

Optimization is done by gradient descent, where Nelder-Mead is used for step-size selection. Nelder Mead may fail to increase the cost at times (when stuck at local minima). User may consider restarting the algorithm when this happens.

References

Chang et. al. (2013) Canonical Correlation Analysis based on Hilbert-Schmidt Independence Criterion and Centered Kernel Target Alignment. ICML 2013.

Gretton et. al. (2005) Measuring statistical dependence with Hilbert-Schmidt Norm. In Algorithmic Learning Theory 2005.

See Also

ktaCCA, hsicCCAfunc

Examples

Run this code
set.seed(1)
numData <- 100
numDim <- 3
x <- matrix(rnorm(numData*numDim),numData,numDim)
y <- matrix(rnorm(numData*numDim),numData,numDim)
z <- runif(numData,-pi,pi)
y[,1] <- cos(z)+rnorm(numData,sd=0.1); x[,1] <- sin(z)+rnorm(numData,sd=0.1)
y[,2] <- x[,2]+rnorm(numData,sd=0.5)
x <- scale(x)
y <- scale(y)

fit <- hsicCCA(x,y,2,numrepeat=2,numiter=10)
par(mfrow=c(1,2))
for (K in 1:2) plot(x%*%fit$Wx[,K],y%*%fit$Wy[,K])

Run the code above in your browser using DataLab