subselect (version 0.1-1)

rv.coef: COMPUTES THE RV-COEFFICIENT FOR VARIABLE SUBSET SELECTION

Description

Computes the RV coefficient, measuring the similarity (after rotations, translations and global re-sizing) of two configurations of n points given by: (i) observations on each of p variables, and (ii) the regression of those p observed variables on a subset of the variables.

Usage

rv.coef(mat, indices)

Arguments

mat
the covariance (or correlation) matrix of the full data set.
indices
a numerical vector of indices of the variables in the subset being considered.

Value

  • The value of the RV-coefficient.

Details

Input data is expected in the form of a (co)variance or correlation matrix of the full data set. If a non-square matrix is given, it is assumed to be a data matrix, and its (co)variance matrix is used as input. The subset of variables on which the full data set will be regressed is given by indices.

The RV-coefficient, for a (column-centered) data matrix X with p variables/columns, and for the regression of these columns on a k-variable subset, is given by: $$RV = \frac{\mathrm{tr}(X X^t \cdot (P_v X)(P_v X)^t)} {\sqrt{\mathrm{tr}((X X^t)^2) \cdot \mathrm{tr}(((P_v X) (P_v X)^t)^2)} }$$ where $P_v$ is the matrix of orthogonal projection onto the subspace spanned by the k-variable subset.

This definition is equivalent to the expression used in the code, which only requires the covariance (or correlation) matrix of the data under consideration.
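The equivalence can be checked numerically. The sketch below is not the package's code: `rv_from_cov` and `rv_from_data` are hypothetical helper names introduced here for illustration. The first evaluates the coefficient from a covariance matrix S alone, assuming the reduced form sqrt(tr((S_G^{-1} [S^2]_G)^2) / tr(S^2)), which follows from the definition by the cyclic property of the trace, where G is the index set of the subset. The second evaluates the definition directly from a data matrix.

```r
# Hypothetical helper (not part of subselect): RV coefficient computed
# from a covariance or correlation matrix S and a subset of indices.
rv_from_cov <- function(S, indices) {
  S2 <- S %*% S                                    # S squared
  A  <- solve(S[indices, indices, drop = FALSE],   # S_G^{-1} [S^2]_G
              S2[indices, indices, drop = FALSE])
  sqrt(sum(diag(A %*% A)) / sum(diag(S2)))
}

# Hypothetical helper: direct evaluation of the definition from a data matrix.
rv_from_data <- function(X, indices) {
  X  <- scale(X, center = TRUE, scale = FALSE)     # column-center the data
  Xg <- X[, indices, drop = FALSE]                 # the k-variable subset
  PX <- Xg %*% solve(crossprod(Xg), crossprod(Xg, X))  # P_v X
  W1 <- tcrossprod(X)                              # X X^t
  W2 <- tcrossprod(PX)                             # (P_v X)(P_v X)^t
  sum(diag(W1 %*% W2)) / sqrt(sum(diag(W1 %*% W1)) * sum(diag(W2 %*% W2)))
}

x <- iris3[, , 1]
rv_from_cov(var(x), c(1, 3))
rv_from_data(x, c(1, 3))
```

The two calls should agree up to floating-point error. Selecting all variables makes the projection the identity on the column space of X, so the coefficient is 1 in that case.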

References

Robert, P. and Escoufier, Y. (1976), "A unifying tool for linear multivariate statistical methods: the RV-coefficient", Applied Statistics, Vol. 25, No. 3, pp. 257-265.

Examples

data(iris3)
x <- iris3[, , 1]
rv.coef(var(x), c(1, 3))
## [1] 0.8659685
