Compute a multivariate location and scale estimate with a high
breakdown point -- this can be thought of as estimating the mean and
covariance of the `good` part of the data. `cov.mve` and `cov.mcd`
are compatibility wrappers.

```
cov.rob(x, cor = FALSE, quantile.used = floor((n + p + 1)/2),
        method = c("mve", "mcd", "classical"),
        nsamp = "best", seed)

cov.mve(...)
cov.mcd(...)
```

A list with components

- center
the final estimate of location.

- cov
the final estimate of scatter.

- cor
(only if `cor = TRUE`) the estimate of the correlation matrix.

- sing
message giving the number of singular samples out of the total.

- crit
the value of the criterion on log scale. For MCD this is the determinant, and for MVE it is proportional to the volume.

- best
the subset used. For MVE the best sample, for MCD the best set of size `quantile.used`.

- n.obs
total number of observations.
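
As a minimal sketch of how the returned list might be inspected, the snippet below fits the default estimator to the built-in `stackloss` data and pulls out a few of the components listed above. The object name `fit` is only illustrative, and the numeric results depend on the random seed.

```r
library(MASS)

set.seed(123)                      # the subset search uses random sampling
fit <- cov.rob(stackloss, cor = TRUE)

fit$center   # robust location estimate, one value per column
fit$cov      # robust scatter matrix
fit$cor      # correlation matrix, present because cor = TRUE
fit$best     # indices of the observations in the best subset
fit$n.obs    # number of rows of the input
```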

The arguments are

- x
a matrix or data frame.

- cor
should the returned result include a correlation matrix?

- quantile.used
the minimum number of the data points regarded as `good` points.

- method
the method to be used -- minimum volume ellipsoid, minimum covariance determinant or classical product-moment. Using `cov.mve` or `cov.mcd` forces `mve` or `mcd` respectively.

- nsamp
the number of samples or `"best"` or `"exact"` or `"sample"`. If `"sample"` the number chosen is `min(5*p, 3000)`, taken from Rousseeuw and Hubert (1997). If `"best"` exhaustive enumeration is done up to 5000 samples; if `"exact"` exhaustive enumeration will be attempted.

- seed
the seed to be used for random sampling: see `RNGkind`. The current value of `.Random.seed` will be preserved if it is set.

- ...
arguments to `cov.rob` other than `method`.
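
The following sketch illustrates how `method` and the wrappers relate, using the built-in `stack.x` matrix. It assumes that resetting the random number generator with `set.seed()` before each call makes the sampling reproducible; the object names `a`, `b` and `cl` are illustrative only.

```r
library(MASS)

x <- stack.x

# The wrapper cov.mcd() should behave like cov.rob(method = "mcd")
# when both start from the same random number state.
set.seed(1)
a <- cov.rob(x, method = "mcd")
set.seed(1)
b <- cov.mcd(x)
all.equal(a$center, b$center)
all.equal(a$cov, b$cov)

# method = "classical" gives the ordinary product-moment estimates.
cl <- cov.rob(x, method = "classical", cor = TRUE)
all.equal(cl$center, colMeans(x))
all.equal(cl$cov, var(x))
```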

For method `"mve"`, an approximate search is made of a subset of
size `quantile.used` with an enclosing ellipsoid of smallest volume; in
method `"mcd"` it is the volume of the Gaussian confidence
ellipsoid, equivalently the determinant of the classical covariance
matrix, that is minimized. The mean of the subset provides a first
estimate of the location, and the rescaled covariance matrix a first
estimate of scatter. The Mahalanobis distances of all the points from
the location estimate for this covariance matrix are calculated, and
those points within the 97.5% point under Gaussian assumptions are
declared to be `good`. The final estimates are the mean and rescaled
covariance of the `good` points.
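
The reweighting step described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the MASS source: the initial subset is simply taken from the `best` component of a `cov.rob` fit, the Marazzi finite-sample correction and the final rescaling are omitted, and the names `m0`, `S0`, `scale0` and `good` are hypothetical.

```r
library(MASS)

x <- as.matrix(stack.x)
n <- nrow(x)
p <- ncol(x)
h <- floor((n + p + 1) / 2)        # default quantile.used

set.seed(123)
best <- cov.rob(x, method = "mcd")$best

# First estimates from the selected subset
m0 <- colMeans(x[best, , drop = FALSE])
S0 <- var(x[best, , drop = FALSE])

# Squared Mahalanobis distances of all points from the first estimate
d2 <- mahalanobis(x, m0, S0)

# Rescale so that the h/n sample quantile of the distances matches
# the corresponding chi-squared quantile (Gaussian assumption)
scale0 <- quantile(d2, h / n) / qchisq(h / n, df = p)

# Points within the 97.5% point are declared good
good <- d2 <= qchisq(0.975, df = p) * scale0

# Mean and (unrescaled) covariance of the good points
center_rw  <- colMeans(x[good, , drop = FALSE])
scatter_rw <- var(x[good, , drop = FALSE])
```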

The rescaling is by the appropriate percentile under Gaussian data; in
addition the first covariance matrix has an *ad hoc* finite-sample
correction given by Marazzi.

For method `"mve"` the search is made over ellipsoids determined
by the covariance matrix of `p` of the data points. For method
`"mcd"` an additional improvement step suggested by Rousseeuw and
van Driessen (1999) is used, in which once a subset of size
`quantile.used` is selected, an ellipsoid based on its covariance
is tested (as this will have no larger a determinant, and may be smaller).
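
One way to picture the improvement step is the concentration step of Rousseeuw and van Driessen: from a subset, compute its mean and covariance, then keep the `quantile.used` points closest in Mahalanobis distance. The sketch below is an illustration only, not the code used by `cov.rob`; the helper `c_step` and the random starting subset are hypothetical.

```r
x <- as.matrix(stack.x)
n <- nrow(x)
p <- ncol(x)
h <- floor((n + p + 1) / 2)

# Hypothetical helper: one concentration step from the subset `idx`
c_step <- function(x, idx, h) {
  m  <- colMeans(x[idx, , drop = FALSE])
  S  <- var(x[idx, , drop = FALSE])
  d2 <- mahalanobis(x, m, S)
  order(d2)[seq_len(h)]            # indices of the h closest points
}

set.seed(42)
subset_old <- sample(n, h)         # arbitrary starting subset of size h
subset_new <- c_step(x, subset_old, h)

det_old <- det(var(x[subset_old, , drop = FALSE]))
det_new <- det(var(x[subset_new, , drop = FALSE]))

# The determinant after the step should be no larger than before
det_new <= det_old
```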

There is a hard limit on the allowed number of samples, \(2^{31} - 1\).
However, practical limits are likely to be much lower
and one might check the number of samples used for exhaustive
enumeration, `combn(NROW(x), NCOL(x) + 1)`, before attempting it.
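
As a quick check before requesting exhaustive enumeration, the number of candidate samples can be counted without generating them. The sketch below uses `choose()`, which returns the count directly, whereas `combn()` with the same arguments builds every combination as a column of a matrix. The name `n_samples` is illustrative.

```r
x <- stack.x

# Number of subsets of size p + 1 out of n observations
n_samples <- choose(NROW(x), NCOL(x) + 1)
n_samples                          # 5985 for the 21 x 3 stack.x matrix

# Compare against the hard limit on the number of samples
n_samples <= 2^31 - 1
```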

P. J. Rousseeuw and A. M. Leroy (1987)
*Robust Regression and Outlier Detection.*
Wiley.

A. Marazzi (1993)
*Algorithms, Routines and S Functions for Robust Statistics.*
Wadsworth and Brooks/Cole.

P. J. Rousseeuw and B. C. van Zomeren (1990) Unmasking
multivariate outliers and leverage points,
*Journal of the American Statistical Association*, **85**, 633--639.

P. J. Rousseeuw and K. van Driessen (1999) A fast algorithm for the
minimum covariance determinant estimator. *Technometrics*
**41**, 212--223.

P. Rousseeuw and M. Hubert (1997) Recent developments in PROGRESS. In
*L1-Statistical Procedures and Related Topics*,
ed. Y. Dodge, IMS Lecture Notes volume **31**, pp. 201--214.

See also `lqs`.

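```
library(MASS)  # provides cov.rob, cov.mve and cov.mcd
set.seed(123)  # the subset search uses random sampling
cov.rob(stackloss)  # default method, MVE
cov.rob(stack.x, method = "mcd", nsamp = "exact")  # MCD with exhaustive enumeration
```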
