symMCD (version 0.6)

MCDzero: MCD with respect to the Origin

Description

Computes the minimum covariance determinant (MCD) estimator with the location fixed at the origin, using either a C++ or a plain R implementation.

Usage

MCDzero(X, alpha = 0.5, ns = 500, nc = 10, delta = 0.01)
MCDzeroR(y, alpha = 0.5, ns = 500, nc = 10, delta = 0.01)

Value

For the C++ implementation, a list with the following components:

SigmaRaw

The raw MCD scatter matrix with the location fixed at the origin.

SigmaRe

The reweighted MCD scatter matrix with the location fixed at the origin.

IndexOptimal

An integer vector giving the indices of the observations forming the optimal subset.

For the R implementation, a list with the following components:

sigma

The raw MCD scatter matrix with the location fixed at the origin.

sigmaR

The reweighted MCD scatter matrix with the location fixed at the origin.

Arguments

y

A numeric data matrix with observations in rows and variables in columns.

X

A numeric data matrix with observations in rows and variables in columns.

alpha

A numeric value in \((0, 1]\) specifying the fraction of observations retained for the MCD computation, usually chosen in the interval \((0.5, 1)\).

ns

An integer specifying the number of random initial subsets used by the algorithm.

nc

An integer specifying the number of concentration steps performed for each initial subset.

delta

A numeric tuning constant used in the reweighting step.

Details

The minimum covariance determinant (MCD) estimator is computed using an iterative algorithm based on random initial subsets, followed by a fixed number of concentration steps. In contrast to the classical MCD, the location is fixed at the origin and is not estimated from the data.

The parameter alpha controls the size of the subset retained at each step and thus determines the robustness of the estimator. For each of the ns random initial subsets, the algorithm applies nc concentration steps to reduce the determinant of the scatter estimate. The solution with the smallest determinant over all starts is retained.
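The search described above can be sketched in plain R. This is a minimal illustration, not the package's implementation: the helper names cstep_zero and mcd_zero_sketch are hypothetical, and both the subset-size rule ceiling(alpha * n) and the use of random initial subsets of that same size are assumptions made for the example.

```r
# Hypothetical helper (not part of symMCD): one concentration step with the
# location fixed at the origin.
cstep_zero <- function(X, idx, h) {
  # Scatter about the origin: no column means are subtracted
  S <- crossprod(X[idx, , drop = FALSE]) / length(idx)
  # Squared Mahalanobis distances of all observations from the origin
  d2 <- mahalanobis(X, center = rep(0, ncol(X)), cov = S)
  # Keep the h observations with the smallest distances
  order(d2)[seq_len(h)]
}

# Hypothetical driver: ns random starts, nc concentration steps per start
mcd_zero_sketch <- function(X, alpha = 0.5, ns = 500, nc = 10) {
  X <- as.matrix(X)
  n <- nrow(X)
  h <- ceiling(alpha * n)  # assumed rule for the subset size
  best <- list(logdet = Inf, sigma = NULL, index = NULL)
  for (s in seq_len(ns)) {
    idx <- sample.int(n, h)  # assumed: random initial subset of size h
    for (k in seq_len(nc)) idx <- cstep_zero(X, idx, h)
    S <- crossprod(X[idx, , drop = FALSE]) / h
    logdet <- as.numeric(determinant(S, logarithm = TRUE)$modulus)
    # Retain the start with the smallest determinant
    if (logdet < best$logdet) {
      best <- list(logdet = logdet, sigma = S, index = sort(idx))
    }
  }
  best
}

set.seed(1)
X <- matrix(rnorm(300), ncol = 3)
fit <- mcd_zero_sketch(X, alpha = 0.5, ns = 50, nc = 10)
fit$sigma
```

Because the scatter is always computed about the origin, each concentration step cannot increase the determinant, which is why a small fixed number of steps per start is usually enough in a sketch like this.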

An optional reweighting step is applied using the tuning constant delta, which aims to improve efficiency by downweighting observations with large squared Mahalanobis distances.
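One common form of such a reweighting step is sketched below. How symMCD actually uses delta is not stated here, so the sketch assumes the convention familiar from the classical MCD: observations whose squared Mahalanobis distance from the origin exceeds the chi-square quantile at level 1 - delta receive weight zero. The function name reweight_zero_sketch and the median-based rescaling of the raw scatter are likewise assumptions.

```r
# Hypothetical helper (not part of symMCD): hard-rejection reweighting with
# the location fixed at the origin.
reweight_zero_sketch <- function(X, sigma_raw, delta = 0.01) {
  X <- as.matrix(X)
  p <- ncol(X)
  origin <- rep(0, p)
  d2 <- mahalanobis(X, center = origin, cov = sigma_raw)
  # Assumed consistency correction: match the median distance to the
  # chi-square median with p degrees of freedom
  sigma_scaled <- sigma_raw * median(d2) / qchisq(0.5, df = p)
  d2 <- mahalanobis(X, center = origin, cov = sigma_scaled)
  # Assumed cutoff: drop observations beyond the (1 - delta) quantile
  keep <- d2 <= qchisq(1 - delta, df = p)
  # Scatter about the origin of the retained observations
  crossprod(X[keep, , drop = FALSE]) / sum(keep)
}

set.seed(1)
X <- matrix(rnorm(300), ncol = 3)
# Stand-in for a raw estimate; in practice this would be the raw MCD scatter
sigma_raw <- crossprod(X) / nrow(X)
sigma_rw <- reweight_zero_sketch(X, sigma_raw, delta = 0.01)
sigma_rw
```

Under this assumed convention, a smaller delta rejects fewer observations, so the reweighted estimate uses more of the data at the cost of admitting more extreme points.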

Examples

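# simulate 100 observations on 3 variables
X <- matrix(rnorm(300), ncol = 3)
colnames(X) <- LETTERS[1:3]

# set the same seed before each call for comparison
set.seed(1)
mcd0cpp <- MCDzero(X, 0.5)
set.seed(1)
mcd0r <- MCDzeroR(X, 0.5)

# raw scatter estimate and its difference from the R implementation
mcd0cpp$SigmaRaw
mcd0cpp$SigmaRaw - mcd0r$sigma

# reweighted scatter estimate and its difference from the R implementation
mcd0cpp$SigmaRe
mcd0cpp$SigmaRe - mcd0r$sigmaR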