ccaGrid: (Robust) CCA via alternating series of grid searches

Description

Perform canonical correlation analysis via projection pursuit based on alternating series of grid searches in two-dimensional subspaces of each data set, with a focus on robust and nonparametric methods.

Usage

ccaGrid(x, y, k = 1,
    method = c("spearman", "kendall", "quadrant", "M", "pearson"),
    control = list(...), nIterations = 10, nAlternate = 10,
    nGrid = 25, select = NULL, tol = 1e-06, seed = NULL,
    ...)
CCAgrid(x, y, k = 1,
    method = c("spearman", "kendall", "quadrant", "M", "pearson"),
    maxiter = 10, maxalter = 10, splitcircle = 25,
    select = NULL, zero.tol = 1e-06, seed = NULL, ...)

Arguments

x,y

each can be a numeric vector, matrix or data frame.

k

an integer giving the number of canonical variables to compute.

method

a character string specifying the correlation functional to maximize. Possible values are "spearman" for the Spearman correlation, "kendall" for the Kendall correlation, "quadrant" for the quadrant correlati

control

a list of additional arguments to be passed to the specified correlation functional. If supplied, this takes precedence over additional arguments supplied via the ... argument.

nIterations,maxiter

an integer giving the maximum number of iterations.

nAlternate,maxalter

an integer giving the maximum number of alternate series of grid searches in each iteration.

nGrid,splitcircle

an integer giving the number of equally spaced grid points on the unit circle to use in each grid search.

select

optional; either an integer vector of length two or a list containing two index vectors. In the first case, the first integer gives the number of variables of x to be randomly selected for determining the order of the variables of

tol,zero.tol

a small positive numeric value to be used for determining convergence.

seed

optional initial seed for the random number generator (see .Random.seed). This is only used if select specifies the numbers of variables of each data set to be randomly selected for

...

additional arguments to be passed to the specified correlation functional.

Value

An object of class "cca" with the following components:
cor

a numeric vector giving the canonical correlation measures.

A

a numeric matrix in which the columns contain the canonical vectors for x.

B

a numeric matrix in which the columns contain the canonical vectors for y.

call

the matched function call.

Details

The algorithm is based on alternating series of grid searches in two-dimensional subspaces of each data set. In each grid search, nGrid grid points on the unit circle in the corresponding plane are obtained, and the directions from the center to each of the grid points are examined. In the first iteration, equispaced grid points in the interval $[-\pi/2, \pi/2)$ are used. In each subsequent iteration, the angles are halved such that the interval $[-\pi/4, \pi/4)$ is used in the second iteration and so on. If only one data set is multivariate, the algorithm simplifies to iterative grid searches in two-dimensional subspaces of the corresponding data set.
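
The shrinking grid of angles described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the helper name gridAngles is hypothetical, and the exact placement of the grid points within each half-open interval is assumed.

```r
# Hypothetical helper (not part of ccaPP): equispaced angles in the
# half-open interval [-pi / 2^iter, pi / 2^iter) for a given iteration
gridAngles <- function(nGrid, iter) {
  halfWidth <- pi / 2^iter
  # drop the right endpoint so that the interval is half-open
  seq(-halfWidth, halfWidth, length.out = nGrid + 1)[-(nGrid + 1)]
}

angles1 <- gridAngles(nGrid = 25, iter = 1)  # first iteration: [-pi/2, pi/2)
angles2 <- gridAngles(nGrid = 25, iter = 2)  # second iteration: [-pi/4, pi/4)

# each angle defines one candidate direction on the unit circle
# in the two-dimensional subspace being searched
directions <- cbind(cos(angles1), sin(angles1))
```

Halving the range while keeping nGrid fixed makes the grid finer around the current best direction in every iteration.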

In the basic algorithm, the order of the variables in a series of grid searches for each of the data sets is determined by the average absolute correlations with the variables of the respective other data set. Since this requires computing the full $(p \times q)$ matrix of absolute correlations, where $p$ denotes the number of variables of x and $q$ the number of variables of y, a faster modification is available as well. In this modification, the average absolute correlations are computed over only a subset of the variables of the respective other data set. The subsets can either be selected at random or be specified directly.
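
The ordering step can be illustrated with a few lines of base R. This is a rough sketch rather than the package's implementation: the toy data, the use of the Spearman correlation, and the assumption that variables with a higher average absolute correlation come first are all chosen for illustration only.

```r
# toy data, assumed purely for illustration
set.seed(1)
x <- matrix(rnorm(100 * 4), ncol = 4)  # p = 4 variables
y <- matrix(rnorm(100 * 3), ncol = 3)  # q = 3 variables

# basic variant: full (p x q) matrix of absolute correlations
absCor <- abs(cor(x, y, method = "spearman"))
# assumed: higher average absolute correlation means earlier in the order
orderX <- order(rowMeans(absCor), decreasing = TRUE)
orderY <- order(colMeans(absCor), decreasing = TRUE)

# faster variant: average over a random subset of the other data set only
subsetX <- sample(ncol(x), 2)  # variables of x used to order y
subsetY <- sample(ncol(y), 2)  # variables of y used to order x
orderXfast <- order(rowMeans(abs(cor(x, y[, subsetY], method = "spearman"))),
                    decreasing = TRUE)
orderYfast <- order(colMeans(abs(cor(x[, subsetX], y, method = "spearman"))),
                    decreasing = TRUE)
```

The faster variant only needs correlations with the selected variables instead of all $p \times q$ entries, which matters when both data sets are high-dimensional.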

Examples

## generate data
library("mvtnorm")
set.seed(1234)  # for reproducibility
p <- 3
q <- 2
m <- p + q
sigma <- 0.5^t(sapply(1:m, function(i, j) abs(i-j), 1:m))
xy <- rmvnorm(100, sigma=sigma)
x <- xy[, 1:p]
y <- xy[, (p+1):m]

## Spearman correlation
ccaGrid(x, y, method = "spearman")
ccaGrid(x, y, method = "spearman", consistent = TRUE)

## Pearson correlation
ccaGrid(x, y, method = "pearson")