
PTAk (version 1.2-6)

CauRuimet: Robust estimation of within-group variance-covariance

Description

Gives a robust estimate of an unknown within-group covariance, aiming to find either dense groups or sparse groups (outliers), depending on the choice of local variance and weighting function.

Usage

CauRuimet(Z,ker=1,m0=1,withingroup=TRUE,
              loc=substitute(apply(Z,2,mean,trim=.1)),matrixmethod=TRUE, Nrandom=3000)

Arguments

Z
matrix
ker
either numerical or a function: if numerical, the weighting function is $e^{-ker \; t}$; otherwise ker=function(t){return(expression)} must be a positive decreasing function.
m0
a neighbourhood graph or another proximity matrix; it is combined with the kernel weights by Hadamard (elementwise) product
withingroup
logical; if TRUE the aim is to give a robust estimate for dense groups, if FALSE the aim is to give a robust estimate for outliers
loc
a vector of locations, or a function (e.g. using the mean or the median) that gives an estimate of it
matrixmethod
if TRUE (only with withingroup), uses matrix computations rather than the double loop that the formula below suggests
Nrandom
if Nrandom < dim(Z)[1], uses only a random sample of Nrandom rows of Z (and of m0, if applicable).
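
The two forms of the ker argument described above can be illustrated in base R without calling PTAk. This is a minimal sketch: the names w_exp, w_inv and t_grid are hypothetical, and 1/(1+t) is just one assumed example of a positive decreasing function.

```r
# Numeric ker: the weight applied to a squared distance t is exp(-ker * t).
ker_num <- 1
w_exp <- function(t) exp(-ker_num * t)

# Function ker: any positive decreasing function of t.
# 1 / (1 + t) is an illustrative choice, not something PTAk prescribes.
w_inv <- function(t) { return(1 / (1 + t)) }

# Compare both weightings on a few squared distances.
t_grid <- c(0, 0.5, 1, 2, 4)
weights <- cbind(t = t_grid,
                 exponential = w_exp(t_grid),
                 inverse = w_inv(t_grid))
print(round(weights, 3))
```

Both columns decrease as t grows, so pairs of observations that are far apart contribute less to the estimate. A larger numeric ker makes the exponential weights fall off faster.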

Value

  • a matrix

Details

When withingroup is TRUE, the local variance (locality being defined by the weighting) is returned, aiming at finding dense groups:

$$W_l=\frac{\sum_{i=1}^{n-1}\sum_{j=i+1}^n m0_{ij}ker(d^2_{S^-}(Z_i,Z_j))(Z_i-Z_j)'(Z_i-Z_j)}{\sum_{i=1}^{n-1}\sum_{j=i+1}^n m0_{ij}ker(d^2_{S^-}(Z_i,Z_j))}$$

where $d^2_{S^-}(\cdot,\cdot)$ is the squared Euclidean distance with metric $S^-$, the inverse of a robust sample covariance (i.e. computed using loc instead of the mean).

When withingroup is FALSE, the robust total weighted variance $W_o$ is returned or, if m0 is not 1, the global weighted variance $W_g$:

$$W_o=\frac{\sum_{i=1}^n ker(d^2_{S^-}(Z_i,\tilde{Z}))(Z_i-\tilde{Z})'(Z_i-\tilde{Z})} {\sum_{i=1}^n ker(d^2_{S^-}(Z_i,\tilde{Z}))}$$

$$W_g=\frac{\sum_{i=1}^{n-1}\sum_{j=i+1}^n m0_{ij}ker(d^2_{S^-}(Z_i,Z_j))(Z_i-\tilde{Z})'(Z_j-\tilde{Z})} {\sum_{i=1}^{n-1}\sum_{j=i+1}^n m0_{ij}ker(d^2_{S^-}(Z_i,Z_j))}$$

where $\tilde{Z}$ is the vector loc. If m0 is a neighbourhood graph and ker is the function returning 1 (so no distance-based proximity is used), the function returns (when withingroup=TRUE) the local variance-covariance matrix as defined in Lebart (1969).
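
The formulas for $W_l$ and $W_o$ can be transcribed almost literally into base R. The sketch below is not the PTAk implementation: the function names are hypothetical, and the robust covariance $S$ is assumed to be the cross-product of the loc-centred data divided by n, which may differ from what CauRuimet does internally.

```r
# Hypothetical helpers (not part of PTAk) transcribing W_l and W_o.

# Turn ker into a weighting function: exp(-ker * t) if numeric.
as_weight <- function(ker) {
  if (is.function(ker)) ker else function(t) exp(-ker * t)
}

# Assumed robust metric: inverse of the covariance centred at loc, divisor n.
robust_metric <- function(Z, loc) solve(crossprod(sweep(Z, 2, loc)) / nrow(Z))

# W_l: double loop over pairs i < j, as in the formula.
local_variance_sketch <- function(Z, ker = 1, m0 = 1,
                                  loc = apply(Z, 2, mean, trim = 0.1)) {
  Z <- as.matrix(Z)
  n <- nrow(Z)
  p <- ncol(Z)
  w <- as_weight(ker)
  Sinv <- robust_metric(Z, loc)
  num <- matrix(0, p, p)
  den <- 0
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      d <- Z[i, ] - Z[j, ]
      d2 <- drop(d %*% Sinv %*% d)           # squared distance in metric S^-
      mij <- if (is.matrix(m0)) m0[i, j] else m0
      wij <- mij * w(d2)
      num <- num + wij * outer(d, d)         # (Z_i - Z_j)'(Z_i - Z_j)
      den <- den + wij
    }
  }
  num / den
}

# W_o: weighted variance around loc, vectorised over rows.
total_variance_sketch <- function(Z, ker = 1,
                                  loc = apply(Z, 2, mean, trim = 0.1)) {
  Z <- as.matrix(Z)
  w <- as_weight(ker)
  Zc <- sweep(Z, 2, loc)
  Sinv <- robust_metric(Z, loc)
  d2 <- rowSums((Zc %*% Sinv) * Zc)          # squared distances to loc
  wt <- vapply(d2, w, numeric(1))
  crossprod(Zc * wt, Zc) / sum(wt)
}

Z <- as.matrix(iris[1:60, 1:4])
Wl <- local_variance_sketch(Z)
Wo <- total_variance_sketch(Z)
```

As a sanity check on the transcription, with m0 = 1 and a constant kernel such as ker = function(t) 1, the pairwise sum in $W_l$ reduces to twice the ordinary sample covariance, 2 * cov(Z). The double loop costs O(n^2) pairs, which is presumably why the package offers matrixmethod and Nrandom.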

References

Caussinus, H and Ruiz, A (1990) Interesting Projections of Multidimensional Data by Means of Generalized Principal Components Analysis. COMPSTAT90, Physica-Verlag, Heidelberg, 121-126.

Faraj, A (1994) Interpretation tools for Generalized Discriminant Analysis. In: New Approaches in Classification and Data Analysis, Springer-Verlag, Heidelberg, 286-291.

Lebart, L (1969) Analyse statistique de la contiguite. Publication de l'Institut de Statistiques Universitaire de Paris, XVIII, 81-112.

Leibovici, D (2008) Spatio-temporal Multiway Decomposition using Principal Tensor Analysis on k-modes: the R package PTAk. To be submitted soon to the Journal of Statistical Software.

See Also

SVDgen

Examples

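library(PTAk)

data(iris)
iris2 <- as.matrix(iris[,1:4])
dimnames(iris2)[[1]] <- as.character(iris[,5])

# robust within-group covariance estimate
D2 <- CauRuimet(iris2,ker=1,withingroup=TRUE)
# matrix power -1, i.e. the inverse of the estimate
D2 <- Powmat(D2,(-1))
# centre the columns before the generalised SVD
iris2 <- sweep(iris2,2,apply(iris2,2,mean))
res <- SVDgen(iris2,D2=D2,D1=1)
plot(res,nb1=1,nb2=2,cex=0.5)
summary(res,testvar=0)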

# the same in a demo function
# source(paste(R.home(),"/library/PTAk/demo/CauRuimet.R",sep=""))
# demo.CauRuimet(ker=4,withingroup=TRUE,openX11s=FALSE)
# demo.CauRuimet(ker=0.15,withingroup=FALSE,openX11s=FALSE)
