subselect (version 0.1-1)

full.k.search: VALUES OF THE RM, GCD AND RV COEFFICIENTS FOR ALL k-VARIABLE SUBSETS OF A DATA SET

Description

Computes the values of the GCD, RV and RM coefficients for all k-variable subsets of a given data set. Outputs the optimal values and subsets, for each of those coefficients, and, optionally, the full results.

Usage

full.k.search(mat, k, print.all = FALSE, file="")

Arguments

mat
the full data set's covariance (or correlation) matrix.
k
the cardinality of the variable subsets that are wanted.
print.all
if TRUE, prints out the values of the three coefficients for all k-variable subsets.
file
the file where the complete results for all k-variable subsets will be written, if print.all=TRUE.

Value

A list with the following items:

  • rmmax: The maximum value of the RM coefficient;
  • rmind: The indices of the k variables in the optimal subset for the RM coefficient;
  • gcdmax: The maximum value of the GCD coefficient;
  • gcdind: The indices of the k variables in the optimal subset for the GCD coefficient;
  • rvmax: The maximum value of the RV coefficient;
  • rvind: The indices of the k variables in the optimal subset for the RV coefficient.

If print.all=TRUE, the full results for all k-variable subsets will be written to file, one subset per line.

Warning

This function is unusable even for moderately large data sets, since the number of k-variable subsets grows combinatorially with the number of variables. For such data sets, consider using the SSCMA software, written and presented by Pedro Duarte Silva in Discarding Variables in Principal Component Analysis: Algorithms for all-subset comparisons, WP-00-002, July 2000, Universidade Católica Portuguesa, psilva@porto.ucp.pt.
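To get a feel for how quickly an exhaustive search becomes impractical, the number of k-variable subsets of p variables can be counted with base R's choose(). The helper name n_subsets and the values of p and k below are purely illustrative and are not part of the package.

```r
# Hypothetical helper: number of k-variable subsets of a p-variable data set.
# Each subset requires evaluating all three coefficients.
n_subsets <- function(p, k) choose(p, k)

n_subsets(4, 2)    # 6 subsets (the size of the iris example)
n_subsets(20, 10)  # 184756 subsets
n_subsets(30, 15)  # 155117520 subsets
```

Going from 20 to 30 variables multiplies the work by a factor of several hundred, which is why a dedicated algorithm is recommended for larger problems.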

Details

Generates all k-variable subsets of the p-variable data set defined by mat. For each subset, computes the values of the GCD, RM and RV coefficients (comparing with the first k Principal Components when computing gcd.coef). When print.all=FALSE, only the optimal values and subsets for each coefficient are printed to standard output. If print.all=TRUE, the full results are also written to the file specified in file.
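The enumerate-and-score procedure described above can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the function names rm_sketch and exhaustive_search_sketch are hypothetical, and the RM formula used (the square root of tr([S²]_K S_K⁻¹) / tr(S), where K is the index set of the subset) is an assumption about how the coefficient is defined; the package's own rm.coef should be treated as authoritative.

```r
# Hypothetical helper: RM coefficient under the ASSUMED definition
# sqrt( tr( (S %*% S)[K, K] %*% solve(S[K, K]) ) / tr(S) ).
rm_sketch <- function(mat, indices) {
  sub <- mat[indices, indices, drop = FALSE]
  sq  <- (mat %*% mat)[indices, indices, drop = FALSE]
  sqrt(sum(diag(solve(sub) %*% sq)) / sum(diag(mat)))
}

# Hypothetical helper: score every k-variable subset with `criterion`
# and keep the best one.
exhaustive_search_sketch <- function(mat, k, criterion) {
  subsets <- combn(ncol(mat), k)   # one subset per column
  values  <- apply(subsets, 2, function(idx) criterion(mat, idx))
  best    <- which.max(values)
  list(max = values[best], indices = subsets[, best])
}

data(iris3)
x <- iris3[, , 1]
res <- exhaustive_search_sketch(cor(x), k = 2, criterion = rm_sketch)
res
```

Passing the criterion as a function keeps the search loop independent of the coefficient, so the same loop could be reused with rv.coef or a wrapper around gcd.coef.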

References

Cadima, J. and Jolliffe, I.T. (2001), "Variable Selection and the Interpretation of Principal Subspaces", Journal of Agricultural, Biological and Environmental Statistics, Vol. 6, 62-79.

See Also

gcd.coef, rm.coef, rv.coef

Examples

library(subselect)
data(iris3)
x <- iris3[, , 1]   # measurements for the first species (50 x 4)
full.k.search(cor(x), k = 2)
## $rmmax
## [1] 0.8233218
## 
## $rmindices
## [1] 2 3
##
## $gcdmax
## [1] 0.7821987
##
## $gcdindices
## [1] 2 3
##
## $rvmax
## [1] 0.8453146
##
## $rvindices
## [1] 1 4
