rrcov (version 0.4-01)

covMest: Constrained M-Estimates of Location and Scatter

Description

Computes constrained M-estimates of multivariate location and scatter based on the translated biweight function (t-biweight), using a high breakdown point initial estimate (Minimum Covariance Determinant, Fast MCD).

Usage

covMest(x, cor=FALSE, r = 0.45, arp = 0.05, eps=1e-3, 
    maxiter=120, control, t0, S0)

Arguments

x
a matrix or data frame.
cor
should the returned result include a correlation matrix? Default is cor = FALSE
r
required breakdown point. Allowed values are between (n - p)/(2 * n) and 1 and the default is 0.45
arp
asymptotic rejection point, i.e. the fraction of points receiving zero weight (see Rocke (1996)). Default is 0.05.
eps
a numeric value specifying the relative precision of the solution of the M-estimate. Defaults to 1e-3
maxiter
maximum number of iterations allowed in the computation of the M-estimate. Defaults to 120
control
a list with estimation options, the same as those provided in the function specification. If the control object is supplied, the parameters from it will be used. If parameters are also passed in the invocation statement, they will override the corresponding values in the control object.
t0
optional initial high breakdown point estimates of the location. If not supplied MCD will be used.
S0
optional initial high breakdown point estimates of the scatter. If not supplied MCD will be used.

Value

  • An object of class "mest", which is essentially a list with the following components. This class is "derived" from "mcd", so the same generic functions (print, plot, summary) can be used. NOTE: this is going to change. In one of the next revisions covMest will return an S4 class "mest" which is derived from (i.e. contains) class "cov".
  • center: the final estimate of location.
  • cov: the final estimate of scatter.
  • cor: the estimate of the correlation matrix (only if cor = TRUE).
  • mah: Mahalanobis distances of the observations, computed using the M-estimates of location and scatter.
  • X: the input data as a matrix.
  • n.obs: total number of observations.
  • method: character string naming the method (M-Estimates).
  • call: the call used (see match.call).
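
To make the distance component concrete, here is a minimal base R sketch of how Mahalanobis distances are computed from a location and a scatter estimate. It is an illustration only: the toy data are made up, and classical estimates (colMeans and cov) stand in for the M-estimates that covMest would supply. Note that mahalanobis() returns squared distances; this page does not say whether the mah component is stored squared.

```r
# Toy data (hypothetical); the last row is an obvious outlier.
x <- cbind(a = c(1, 2, 3, 4, 5, 20),
           b = c(2, 1, 4, 3, 5, -10))

# Classical estimates used as stand-ins for the robust center and cov.
center  <- colMeans(x)
scatter <- cov(x)

# mahalanobis() returns SQUARED distances (x - center)' S^-1 (x - center).
d2 <- mahalanobis(x, center, scatter)
d  <- sqrt(d2)

print(round(d, 3))
```

With robust estimates in place of the classical ones, outlying rows are not able to inflate the scatter matrix, so their distances stand out more clearly.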

concept

High breakdown point

Details

Rocke (1996) has shown that the S-estimates of multivariate location and scatter in high dimensions can be sensitive to outliers even if the breakdown point is set near 0.5. To mitigate this problem he proposed using the translated biweight (or t-biweight) method with a standardization step that equates the median of rho(d) with the median under normality. The result is not an S-estimate but a constrained M-estimate. For such smooth estimators to work, a reasonable starting point is necessary, one that leads reliably to a good solution. covMest uses the MCD computed by covMcd, but the user can supply her own initial estimates.
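
The shape of the t-biweight can be sketched in a few lines of base R. The function below is an illustration, assuming the usual form of the translated biweight weight function: full weight up to a distance M, a smooth biweight descent over a band of width c, and zero weight beyond M + c. The names tbiweight_weight, M and c are hypothetical and not part of rrcov; covMest determines its own constants internally from r and arp.

```r
# Weight function w(d) = psi(d) / d of the translated biweight (assumed form).
# M: distance up to which observations keep full weight (illustrative name).
# c: width of the descending part (illustrative name).
tbiweight_weight <- function(d, M, c) {
  stopifnot(M >= 0, c > 0)
  d <- abs(d)
  w <- numeric(length(d))                  # zero weight beyond M + c
  w[d < M] <- 1                            # full weight close to the center
  mid <- d >= M & d <= M + c
  w[mid] <- (1 - ((d[mid] - M) / c)^2)^2   # smooth descent from 1 to 0
  w
}

d <- c(0, 1, 2, 3, 4, 5)
w <- tbiweight_weight(d, M = 2, c = 2)
print(w)
```

Under this form, observations farther than M + c from the center receive exactly zero weight, which is the role the asymptotic rejection point arp plays in the arguments above.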

References

D.L. Woodruff and D.M. Rocke (1994) Computable robust estimation of multivariate location and shape in high dimension using compound estimators, Journal of the American Statistical Association, 89, 888-896.

D.M. Rocke (1996) Robustness properties of S-estimates of multivariate location and shape in high dimension, Annals of Statistics, 24, 1327-1345.

D.M. Rocke and D.L. Woodruff (1996) Identification of outliers in multivariate data, Journal of the American Statistical Association, 91, 1047-1061.

See Also

covMcd

Examples

library(rrcov)
data(hbk)
hbk.x <- data.matrix(hbk[, 1:3])
covMest(hbk.x)

## the following three statements are equivalent
c0 <- covMest(hbk.x)
c1 <- covMest(hbk.x, r = 0.45)
c2 <- covMest(hbk.x, control = rrcov.control(r = 0.45))
## an argument given directly overrides the value in control:
c3 <- covMest(hbk.x, r = 0.45,
              control = rrcov.control(r = 0.25))
c1