Rdimtools (version 0.3.2)

do.kpca: Kernel Principal Component Analysis

Description

Kernel principal component analysis (KPCA, or kernel PCA) is a nonlinear extension of classical PCA based on the kernel trick, a common way of introducing nonlinearity: the covariance structure (or another Gram-type estimate) is replaced by a kernel matrix, so that the analysis effectively takes place in a Reproducing Kernel Hilbert Space.
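As a rough illustration of that idea (not the code that do.kpca actually runs), one can build a kernel (Gram) matrix on the observations, double-center it as the feature-space analogue of centering the data, and project onto its leading eigenvectors. The minimal base-R sketch below assumes a Gaussian kernel; the function name kpca_sketch and the bandwidth argument sigma are illustrative only.

kpca_sketch <- function(X, ndim = 2, sigma = 1) {
  n   <- nrow(X)
  D2  <- as.matrix(dist(X))^2              # squared pairwise distances
  K   <- exp(-D2 / (2 * sigma^2))          # Gaussian kernel (Gram) matrix
  H   <- diag(n) - matrix(1/n, n, n)       # centering matrix
  Kc  <- H %*% K %*% H                     # center the kernel in feature space
  eig <- eigen(Kc, symmetric = TRUE)
  lam <- pmax(eig$values[1:ndim], 0)       # guard against tiny negative eigenvalues
  Y   <- eig$vectors[, 1:ndim, drop = FALSE] %*% diag(sqrt(lam), ndim)
  list(Y = Y, vars = lam / n)              # embedded points and their variances
}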

Usage

do.kpca(X, ndim = 2, preprocess = c("null", "center", "scale", "cscale",
  "whiten", "decorrelate"), kernel = c("gaussian", 1))

Arguments

X

an \((n\times p)\) matrix or data frame whose rows are observations and columns represent independent variables.

ndim

an integer-valued target dimension.

preprocess

an additional option for preprocessing the data. Default is "null". See also aux.preprocess for more details.

kernel

a vector containing the name of a kernel and its corresponding parameter(s). See also aux.kernelcov for a complete description of the kernel trick.

Value

a named list containing

Y

an \((n\times ndim)\) matrix whose rows are embedded observations.

trfinfo

a list containing information for out-of-sample prediction.

vars

variances of the projected data, i.e., eigenvalues of the kernelized covariance matrix.
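For instance, a hedged usage sketch (assuming only that vars is a numeric vector of eigenvalues in decreasing order, as described above) is to compare how much variance each returned coordinate carries:

X   <- aux.gensamples(dname="ribbon", n=123)
out <- do.kpca(X, ndim=2)
out$vars                   # eigenvalues of the kernelized covariance matrix
out$vars / sum(out$vars)   # relative share among the returned components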

References

Schölkopf B, Smola A, Müller K (1997). "Kernel Principal Component Analysis." In Artificial Neural Networks - ICANN'97, Lecture Notes in Computer Science, vol. 1327, pp. 583-588. Springer, Berlin, Heidelberg.

See Also

aux.kernelcov

Examples

## generate ribbon-shaped data
X <- aux.gensamples(dname="ribbon", n=123)

## 1. standard KPCA with gaussian kernel
output1 <- do.kpca(X,ndim=2)

## 2. gaussian kernel with large bandwidth
output2 <- do.kpca(X,ndim=2,kernel=c("gaussian",5))

## 3. use laplacian kernel
output3 <- do.kpca(X,ndim=2,kernel=c("laplacian",1))

## Visualize three different projections
par(mfrow=c(1,3))
plot(output1$Y[,1],output1$Y[,2],main="Gaussian kernel")
plot(output2$Y[,1],output2$Y[,2],main="Gaussian kernel with sigma=5")
plot(output3$Y[,1],output3$Y[,2],main="Laplacian kernel")
