mrfDepth (version 1.0.12)

hdepthmedian: Location estimates based on halfspace depth.

Description

Computes the halfspace median and its corresponding halfspace depth for a \(p\)-dimensional data set x. Computation is exact for \(p \le 2\) and approximate for \(p > 2\).

Usage

hdepthmedian(x, maxdir = NULL)

Arguments

x

An \(n\) by \(p\) data matrix.

maxdir

The number of projections used in the approximate algorithm. Defaults to \(250p\).

Value

A list containing:

median

The coordinates of the halfspace median. Approximate when \(p>2\).

depth

The halfspace depth of the halfspace median. Approximate when \(p>2\).

dithered

Logical indicating whether dithering has been applied in the exact algorithm: TRUE if it has, FALSE if it has not.

ndir

The number of projections used by the approximate algorithm. Because some \(p\)-subsamples may be singular, it is possible that not all maxdir directions are evaluated.

AlgStopFlag

Indicates which stopping rule is used by the approximate algorithm. 0 indicates that the maximum number of projections was reached. 1 indicates that no improvement of the location estimate was made after \(10(p+1)\) steps.

dimension

If the data are lying in a lower dimensional subspace, the dimension of this subspace.

hyperplane

If the data are lying in a lower dimensional subspace, a direction orthogonal to this subspace.

Details

The halfspace median, or Tukey median, is the multivariate point with largest halfspace depth with respect to the data x. This point is not always unique. In that case the halfspace median corresponds to the center of gravity of the convex set of deepest points.
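To make the definition concrete, the following base R sketch approximates the halfspace depth of a point with respect to a small bivariate data set by scanning a grid of directions. The helper hdepth_grid and the toy matrix square are hypothetical names introduced here for illustration; this is not the algorithm used by hdepthmedian, and expressing the depth as a fraction of \(n\) is an assumption of the sketch.

```r
# Illustrative helper (not part of mrfDepth): approximate the halfspace depth
# of the point `theta` with respect to the rows of the n x 2 matrix `x`.
# For each direction u on a grid, count the observations in the closed
# halfplane {z : u'(z - theta) >= 0}; the depth is the smallest such count,
# returned here as a fraction of n.
hdepth_grid <- function(theta, x, n_angles = 360) {
  x <- as.matrix(x)
  angles <- seq(0, 2 * pi, length.out = n_angles + 1)[-1]
  dirs <- cbind(cos(angles), sin(angles))   # one unit vector per row
  centered <- sweep(x, 2, theta)            # x_i - theta
  proj <- centered %*% t(dirs)              # n x n_angles projections
  counts <- colSums(proj >= -1e-12)         # small tolerance for rounding
  min(counts) / nrow(x)
}

# Four corners of a square plus its centre.
square <- rbind(c(-1, -1), c(-1, 1), c(1, -1), c(1, 1), c(0, 0))

depth_centre <- hdepth_grid(c(0, 0), square)  # the deepest point of this set
depth_corner <- hdepth_grid(c(1, 1), square)  # an extreme point
```

In this toy set the centre is deeper than any corner, so it is the halfspace median. A finite grid of directions can only overestimate the true depth, which is one reason exact algorithms are preferred when \(p \le 2\).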

The routine first checks whether the data lie in a subspace of dimension lower than \(p\). If so, it issues a warning and returns the dimension of the subspace together with a direction describing a hyperplane containing this subspace.

For bivariate data the exact algorithm of Rousseeuw and Ruts (1998) is applied. When the data are not in general position (i.e. when there is a line containing more than two observations) dithering is performed by adding random Gaussian noise to the data. In this case the output argument dithered is set to TRUE.
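The general-position condition can be checked by brute force on small data sets. The sketch below is a hypothetical helper (has_collinear_triple is not a function of mrfDepth, and the package may detect this situation differently); it tests every triple of observations for collinearity and assumes at least three rows.

```r
# Illustrative helper (not part of mrfDepth): TRUE if some line contains
# more than two observations of the n x 2 matrix `x` (n >= 3 assumed).
has_collinear_triple <- function(x, tol = 1e-12) {
  x <- as.matrix(x)
  triples <- utils::combn(nrow(x), 3)       # all index triples, one per column
  any(apply(triples, 2, function(idx) {
    p1 <- x[idx[1], ]
    p2 <- x[idx[2], ]
    p3 <- x[idx[3], ]
    # Twice the signed area of the triangle p1 p2 p3; zero means collinear.
    area2 <- (p2[1] - p1[1]) * (p3[2] - p1[2]) -
      (p2[2] - p1[2]) * (p3[1] - p1[1])
    abs(area2) <= tol
  }))
}

triangle <- rbind(c(0, 0), c(1, 0), c(0, 1))
diagonal <- rbind(c(-1, -1), c(0, 0), c(1, 1), c(1, -1))

in_general_position <- !has_collinear_triple(triangle)  # no three on a line
needs_dithering <- has_collinear_triple(diagonal)       # three on the diagonal
```

The check examines all \(\binom{n}{3}\) triples, so it is only practical for small \(n\).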

When \(p > 2\) the approximate algorithm of Struyf and Rousseeuw (2000) is applied. It is an iterative procedure based on projections. The number of projections can be set with the input parameter maxdir.
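A minimal sketch of a call in the approximate case, assuming the mrfDepth package is installed. The simulated matrix x3 and the value maxdir = 1000 are arbitrary choices made for illustration; the inspected components are those listed under Value.

```r
# Simulated trivariate data; x3 is a made-up example data set.
set.seed(1)
x3 <- matrix(rnorm(300), ncol = 3)

if (requireNamespace("mrfDepth", quietly = TRUE)) {
  # Request more projections than the default of 250 * p.
  res <- mrfDepth::hdepthmedian(x = x3, maxdir = 1000)

  res$median       # approximate halfspace median
  res$depth        # approximate halfspace depth of that point
  res$ndir         # number of projections actually used
  res$AlgStopFlag  # 0: maxdir reached, 1: no improvement after 10(p+1) steps
}
```

Since the result is approximate for \(p > 2\), comparing runs with different values of maxdir gives a rough impression of how stable the estimate is.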

References

Rousseeuw P.J., Ruts I. (1998). Constructing the bivariate Tukey median. Statistica Sinica, 8, 827--839.

Struyf A., Rousseeuw P.J. (2000). High-dimensional computation of the deepest location. Computational Statistics & Data Analysis, 34, 415--436.

Examples

# NOT RUN {
# Compute a location estimate of a simple 
# two-dimensional dataset.

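library(mrfDepth)
data(cardata90)
Result <- hdepthmedian(x = cardata90)
Result$median  # coordinates of the halfspace median
Result$depth   # its halfspace depth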
# }
