concor (version 1.0-0.1)

concoreg: Redundancy of sets yj by one set x

Description

Regression of several subsets of variables yj on another set x, computed as successive solutions.

Usage

concoreg(x,y,py,r)

Arguments

x
an $n \times p$ matrix of p centered explanatory variables
y
an $n \times q$ matrix of q centered variables
py
a row vector containing the sizes $q_i$, $i=1,...,ky$, of the ky subsets $y_i$ of y, so that $\sum_i q_i$ = sum(py) = q. py is the partition vector of y
r
the number of successive solutions to compute
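
As an illustration of how the partition vector relates to the columns of y, the following base R sketch builds the column indices of each block from py. The data and the names `ends`, `starts` and `blocks` are hypothetical and are not part of the concor package.

```r
# Hypothetical data: 10 observations, 9 variables split into 3 blocks
set.seed(1)
y  <- scale(matrix(runif(90), 10, 9), scale = FALSE)  # centered columns
py <- c(3, 2, 4)                                      # block sizes q_1, q_2, q_3

# The partition must account for every column of y
stopifnot(sum(py) == ncol(y))

# Column indices of each block y_i, derived from py
ends   <- cumsum(py)
starts <- ends - py + 1
blocks <- Map(seq, starts, ends)

# Second block y_2 (columns 4 and 5)
y2 <- y[, blocks[[2]], drop = FALSE]
dim(y2)
```

With py = c(3,2,4), the blocks cover columns 1 to 3, 4 to 5 and 6 to 9 of y.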

Value

A list with the following components:

  • cx: the $n \times r$ matrix of the r explanatory components
  • v: a $q \times r$ matrix of ky row blocks $v_i$ ($q_i \times r$) of axes in $R^{q_i}$ relative to $y_i$; $v_i'*v_i = \mbox{Id}$
  • V: a $q \times r$ matrix of axes in $R^q$ relative to y; $V'*V = \mbox{Id}$
  • varexp: a $ky \times r$ matrix; each column k contains ky explained variances $\rho(cx[,k],y_i*v_i[,k])^2 \mbox{var}(y_i*v_i[,k])$

Details

The first solution computes 1+ky normed vectors: the component cx[,1] in $R^n$ and the ky associated vectors $v_i[,1]$ in $R^{q_i}$. They are obtained by maximizing $varexp1=\sum_i \rho(cx[,1],y_i*v_i[,1])^2 \mbox{var}(y_i*v_i[,1])$ under $1+ky$ norm constraints. An explanatory component cx[,k] is associated with ky partial explained components $y_i*v_i[,k]$ and also with a global explained component y*V[,k], where $\rho(cx[,k],y*V[,k])^2 \mbox{var}(y*V[,k])= \mbox{varexpk}$. The total variance explained by the first solution is maximal.
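
Each term of this criterion can be unpacked with a standard identity. As a sketch, write $a$ for cx[,k] and $b$ for $y_i*v_i[,k]$ (shorthand introduced here for brevity), and assume that variances and covariances use the same normalization:

$$\rho(a,b)^2 \, \mbox{var}(b) = \frac{\mbox{cov}(a,b)^2}{\mbox{var}(a)\,\mbox{var}(b)} \, \mbox{var}(b) = \frac{\mbox{cov}(a,b)^2}{\mbox{var}(a)}$$

If the norm constraint on cx[,k] amounts to $\mbox{var}(a)=1$ (an assumption, not stated explicitly here), each entry of varexp reduces to a squared covariance between the explanatory component and a partial explained component. This would be consistent with the first check in the Examples, which compares a squared cross-product divided by n with co$varexp[1,1].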

The second solution is obtained from the same criterion after replacing each $y_i$ by $y_i-y_i*v_i[,1]*v_i[,1]'$, and so on for the successive solutions 1,2,...,r. The largest possible number of solutions is $r=\inf(n,p,q_i)$, assuming the matrices x'*yi have full rank. For a set of r solutions, the matrix (cx)'*y*V is diagonal: "on average", the explanatory component of one solution is linked only with the components explained by that explanatory component, and not with the explained components of the other solutions. The matrices $(cx)'*y_j*v_j$ are triangular: the explanatory component of one solution is not linked with any of the partial components explained in the following solutions. From the second solution onward, the definition of the explanatory components depends on the partition vector py.
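
The deflation step $y_i-y_i*v_i[,1]*v_i[,1]'$ can be sketched in base R. This is only an illustration: `yi` is simulated data and `v1` is an arbitrary unit-norm vector standing in for $v_i[,1]$, not an axis actually returned by concoreg.

```r
# Hypothetical centered block y_i with n = 10 rows and q_i = 3 columns
set.seed(1)
yi <- scale(matrix(runif(30), 10, 3), scale = FALSE)

# Hypothetical unit-norm axis standing in for v_i[,1]
v1 <- c(1, 2, 2) / 3
stopifnot(abs(sum(v1^2) - 1) < 1e-12)

# Deflation: remove from y_i the part carried by the axis v1
yi_defl <- yi - yi %*% v1 %*% t(v1)

# The deflated block has no component left along v1
resid_along_v1 <- max(abs(yi_defl %*% v1))

# The column rank drops by one, from 3 to 2
rank_before <- qr(yi)$rank
rank_after  <- qr(yi_defl)$rank
```

Because the deflated block is orthogonal to the axis already used, the next solution has to look for a new direction in each $R^{q_i}$, which is why the number of solutions is bounded by the block sizes $q_i$.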

This function uses the concor function.

References

Hanafi & Lafosse (2001) Generalisation de la regression lineaire simple pour analyser la dependance de K ensembles de variables avec un K+1 eme. Revue de Statistique Appliquee vol.49, n.1.

Chessel D. & Hanafi M. (1996) Analyses de la Co-inertie de K nuages de points. Revue de Statistique Appliquee vol.44, n.2. (This ACOM analysis of one multiset is obtained with the command concoreg(Y,Y,py,r).)

Examples

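library(concor)
# Simulate and standardize x (10 x 5) and y (10 x 9)
x <- scale(matrix(runif(50), 10, 5))
y <- scale(matrix(runif(90), 10, 9))
# Two successive solutions; y is partitioned into blocks of 3, 2 and 4 columns
co <- concoreg(x, y, c(3, 2, 4), 2)
# Explained variance for block 1, solution 1, computed by hand and as returned
((t(co$cx[, 1]) %*% y[, 1:3] %*% co$v[1:3, 1]) / 10)^2; co$varexp[1, 1]
# Cross-products of the explanatory components, divided by n
t(co$cx) %*% co$cx / 10
# Global explained variances, to compare with the column sums of varexp
diag(t(co$cx) %*% y %*% co$V / 10)^2
sum(co$varexp[, 1]); sum(co$varexp[, 2])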