Computes the RM coefficient, which measures the similarity of the
spectral decompositions of a p-variable data matrix and of the matrix that
results from regressing those variables on a subset (given by "indices") of
the variables. Input data is expected in the form of a (co)variance or
correlation matrix. If a non-square matrix is given, it is assumed to
be a data matrix, and its correlation matrix is used as input.

The definition of the RM coefficient is as follows:
$$RM = \sqrt{\frac{\mathrm{tr}(X^t P_v X)}{\mathrm{tr}(X^t X)}} $$
where $X$ is the full
(column-centered) data matrix and $P_v$ is the matrix of
orthogonal projections onto the subspace spanned by a k-variable subset.
This definition is equivalent to:
$$RM = \sqrt{\frac{\sum\limits_{i=1}^{p}\lambda_i r_i^2}{\sum\limits_{j=1}^{p}\lambda_j}} $$
where $\lambda_i$ stands for the $i$-th largest
eigenvalue of the covariance matrix defined by $X$ and
$r_i$ stands for the multiple correlation between the
$i$-th Principal Component and the k-variable subset.
These definitions are also equivalent to the expression used in the
code, which only requires the covariance (or correlation) matrix of
the data under consideration.
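
The equivalence of the three forms can be checked numerically. The following is a minimal sketch in base R on simulated data, not the package's own code. It assumes that the covariance-only expression is the square root of $\mathrm{tr}([S^2]_K S_K^{-1}) / \mathrm{tr}(S)$, which follows from substituting $P_v = X_K (X_K^t X_K)^{-1} X_K^t$ into the trace definition. Here $S$ is the covariance matrix, the subscript $K$ denotes the rows and columns of the chosen subset, and the helper name rm_from_cov is hypothetical.

```r
# Hypothetical helper (an assumption, not the package's function):
# RM computed from a covariance or correlation matrix S only.
rm_from_cov <- function(S, indices) {
  S2 <- S %*% S
  num <- sum(diag(S2[indices, indices, drop = FALSE] %*%
                  solve(S[indices, indices, drop = FALSE])))
  sqrt(num / sum(diag(S)))
}

# Simulated data, for illustration only.
set.seed(1)
raw <- matrix(rnorm(200 * 5), nrow = 200)
raw[, 4] <- raw[, 1] + 0.5 * raw[, 4]          # induce some correlation
X <- scale(raw, center = TRUE, scale = FALSE)  # column-centered data matrix
idx <- c(1, 3)                                 # a 2-variable subset

# Trace definition: P is the orthogonal projector onto the span of X[, idx].
Xk <- X[, idx, drop = FALSE]
P <- Xk %*% solve(crossprod(Xk)) %*% t(Xk)
rm_trace <- sqrt(sum(diag(t(X) %*% P %*% X)) / sum(diag(crossprod(X))))

# Eigenvalue definition: r2[i] is the squared multiple correlation between
# the i-th principal component and the variables in the subset.
S <- cov(X)
eig <- eigen(S, symmetric = TRUE)
pcs <- X %*% eig$vectors
r2 <- apply(pcs, 2, function(pc) summary(lm(pc ~ Xk))$r.squared)
rm_eigen <- sqrt(sum(eig$values * r2) / sum(eig$values))

# Covariance-only form.
rm_cov <- rm_from_cov(S, idx)

print(c(trace = rm_trace, eigen = rm_eigen, cov = rm_cov))
```

The three printed values should agree up to floating-point error, since the factor $n-1$ relating $X^t X$ to $S$ cancels between numerator and denominator.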
The fact that indices can be a matrix or 3-d array allows for
the computation of the RM values of subsets that the search
functions anneal, genetic and improve (whose $subsets output is a
matrix or 3-d array) produced using a different criterion (see the
example below).
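
To illustrate what evaluating several subsets at once amounts to, here is a rough base R sketch for the matrix case only. It assumes that each row of the index matrix holds one subset of equal size. The helper rm_from_cov and the index matrix are hypothetical and written for this illustration; the built-in swiss data set stands in for real input.

```r
# Hypothetical helper (an assumption, not the package's function):
# RM computed from a covariance or correlation matrix S only.
rm_from_cov <- function(S, indices) {
  S2 <- S %*% S
  num <- sum(diag(S2[indices, indices, drop = FALSE] %*%
                  solve(S[indices, indices, drop = FALSE])))
  sqrt(num / sum(diag(S)))
}

S <- cor(swiss)                      # correlation matrix of a built-in data set

# Made-up matrix of candidate subsets: one 2-variable subset per row.
subsets <- rbind(c(1, 2),
                 c(3, 6),
                 c(4, 5))

# One RM value per row of the index matrix.
rm_vals <- apply(subsets, 1, function(idx) rm_from_cov(S, idx))
print(rm_vals)
```

Each value lies between 0 and 1, with larger values indicating that the subset reproduces more of the spectral structure of the full data.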