fpc (version 2.2-9)

awcoord: Asymmetric weighted discriminant coordinates

Description

Asymmetric weighted discriminant coordinates as defined in Hennig (2003). Asymmetric discriminant projection means that there are two classes, one of which is treated as the homogeneous class (i.e., it should appear homogeneous and separated in the resulting projection) while the other may be heterogeneous. The principle is to maximize the ratio between the projection of a between classes separation matrix and the projection of the covariance matrix within the homogeneous class. Points are weighted according to their (robust) Mahalanobis distance to the homogeneous class.

Usage

awcoord(xd, clvecd, clnum=1, mahal="square", method="classical",
                     clweight=switch(method,classical=FALSE,TRUE),
                     alpha=0.99, subsample=0, countmode=1000, ...)

Arguments

xd

the data matrix; a numerical object which can be coerced to a matrix.

clvecd

integer vector of class numbers; length must equal nrow(xd).

clnum

integer. Number of the homogeneous class.

mahal

"md" or "square". If "md", the points are weighted by the square root of the alpha-quantile of the corresponding chi squared distribution over the roots of their Mahalanobis distance to the homogeneous class, unless this is smaller than 1. If "square" (which is recommended), the (originally squared) Mahalanobis distance and the unrooted quantile is used.

method

one of "mve", "mcd" or "classical". Covariance matrix used within the homogeneous class and for the computation of the Mahalanobis distances. "mcd" and "mve" are robust covariance matrices as implemented in cov.rob. "classical" refers to the classical covariance matrix.

clweight

logical. If FALSE, only the points of the heterogeneous class are weighted. This, together with method="classical", computes AWC as defined in Hennig (2003). If TRUE, all points are weighted. This, together with method="mcd", computes ARC as defined in Hennig (2003).

alpha

numeric between 0 and 1. The corresponding quantile of the chi squared distribution is used for the downweighting of points. Points with a smaller Mahalanobis distance to the homogeneous class get full weight.

subsample

integer. If 0, all points are used. Else, only a subsample of subsample of the points is used.

countmode

optional positive integer. Every countmode algorithm runs awcoord shows a message.

...

no effect

Value

List with the following components

ev

eigenvalues in descending order.

units

columns are coordinates of projection basis vectors. New points x can be projected onto the projection basis vectors by x %*% units

proj

projections of xd onto units.

Details

The square root of the homogeneous classes covariance matrix is inverted by use of tdecomp, which can be expected to give reasonable results for singular within-class covariance matrices.

References

Hennig, C. (2004) Asymmetric linear dimension reduction for classification. Journal of Computational and Graphical Statistics 13, 930-945 .

Hennig, C. (2005) A method for visual cluster validation. In: Weihs, C. and Gaul, W. (eds.): Classification - The Ubiquitous Challenge. Springer, Heidelberg 2005, 153-160.

See Also

plotcluster for straight forward discriminant plots. discrproj for alternatives. rFace for generation of the example data used below.

Examples

# NOT RUN {
  set.seed(4634)
  face <- rFace(600,dMoNo=2,dNoEy=0)
  grface <- as.integer(attr(face,"grouping"))
  awcf <- awcoord(face,grface==1)
  # awcf2 <- ancoord(face,grface==1, method="mcd")
  plot(awcf$proj,col=1+(grface==1))
  # plot(awcf2$proj,col=1+(grface==1))
  # ...done in one step by function plotcluster.
# }