mrfDepth (version 1.0.12)

sprojdepth: Skewness-adjusted projection depth of points relative to a dataset

Description

Computes the skewness-adjusted projection depth of \(p\)-dimensional points z relative to a \(p\)-dimensional dataset x.

Usage

sprojdepth(x, z = NULL, options = NULL)

Arguments

x

An \(n\) by \(p\) data matrix with observations in the rows and variables in the columns.

z

An optional \(m\) by \(p\) matrix containing rowwise the points \(z_i\) for which to compute the skewness-adjusted projection depth. If z is not specified, it is set equal to x.

options

A list of options to pass to the underlying adjOutl routine. See adjOutl for the full list of options.

Value

A list with components:

depthX

Vector of length \(n\) giving the skewness-adjusted projection depth of the observations in x.

depthZ

Vector of length \(m\) giving the skewness-adjusted projection depth of the points in z.

cutoff

Points whose skewness-adjusted projection depth is smaller than this cutoff can be considered outliers.

flagX

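Observations of x whose skewness-adjusted projection depth is smaller than the cutoff (equivalently, whose adjusted outlyingness is too large) receive the flag FALSE; regular observations receive the flag TRUE.

flagZ

Points of z whose skewness-adjusted projection depth is smaller than the cutoff (equivalently, whose adjusted outlyingness is too large) receive the flag FALSE; all other points receive the flag TRUE.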

singularSubsets

When the option type is equal to "Affine", the number of \(p\)-subsets that span a subspace of dimension smaller than \(p-1\). For such a subset the orthogonal direction cannot be uniquely determined, which indicates that the data are not in general position. When the option type is equal to "Rotation", two randomly selected data points may coincide because of ties in the data; in that case this value counts how many times this occurred.

dimension

When the data x lie in a lower-dimensional subspace, the dimension of this subspace.

hyperplane

When the data x lie in a lower-dimensional subspace, a direction orthogonal to this subspace. When a direction \(v\) is found such that the robust skew-adjusted scale of \(xv\) is equal to zero, this component equals \(v\).

inSubspace

When a direction \(v\) is found such that AO(\(xv\)) is ill-defined, the observations from x which belong to the hyperplane orthogonal to \(v\) receive a value TRUE. The other observations receive a value FALSE.

Details

Skewness-adjusted projection depth is based on the adjusted outlyingness and is computed as \(1/(1+AO)\). As adjusted outlyingness extends the Stahel-Donoho outlyingness towards skewed distributions, the skewness-adjusted projection depth is suited for both elliptical distributions and skewed multivariate data.
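The relation between AO and depth can be illustrated with a small base R sketch. The AO values, the cutoff value, and the helper name ao_to_depth below are assumptions made for illustration only; they are not output of, or part of, mrfDepth. Because \(1/(1+AO)\) decreases as AO increases, a cutoff on the AO scale corresponds to a cutoff on the depth scale, and both rules flag the same points.

```r
# Hypothetical adjusted-outlyingness values, for illustration only
ao <- c(0, 0.5, 1, 3, 9)

# Depth as described above: 1 / (1 + AO)
# (ao_to_depth is a hypothetical helper, not an mrfDepth function)
ao_to_depth <- function(ao) 1 / (1 + ao)
depth <- ao_to_depth(ao)

# An assumed cutoff on the AO scale and its counterpart on the depth scale
ao_cutoff    <- 2.5
depth_cutoff <- ao_to_depth(ao_cutoff)

# Both rules give the same flags (TRUE = regular, FALSE = outlying)
flag_from_ao    <- ao <= ao_cutoff
flag_from_depth <- depth >= depth_cutoff
```

A large AO maps to a depth close to 0 and an AO of 0 maps to the maximal depth of 1, so low depth and high outlyingness carry the same information.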

The function first checks whether the data lie in a subspace of dimension lower than \(p\). If so, it issues a warning and reports the dimension of the subspace together with a direction orthogonal to it.
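As a rough illustration of what such a check involves, the base R sketch below detects a lower-dimensional subspace with a singular value decomposition of the centered data. This is a swapped-in technique and not necessarily how mrfDepth performs the check; the helper name subspace_check and the tolerance tol are assumptions.

```r
# Hypothetical helper, not part of mrfDepth: detect whether the rows of x
# lie in an affine subspace of dimension lower than p, using an SVD.
subspace_check <- function(x, tol = 1e-8) {
  x  <- as.matrix(x)
  p  <- ncol(x)
  xc <- sweep(x, 2, colMeans(x))        # center each column
  sv <- svd(xc, nu = 0, nv = p)         # request all p right singular vectors
  # Numerical rank: number of non-negligible singular values
  dimension <- sum(sv$d > tol * max(sv$d, 1))
  list(
    dimension  = dimension,
    # The last right singular vector is orthogonal to the spanned subspace
    hyperplane = if (dimension < p) sv$v[, p] else NULL
  )
}

# Points lying exactly on a line in the plane
set.seed(1)
a      <- rnorm(20)
x_flat <- cbind(a, 2 * a + 1)
res    <- subspace_check(x_flat)
```

Here res$dimension is 1 and res$hyperplane is orthogonal to the direction (1, 2) of the line, which mirrors the roles of the dimension and hyperplane components described under Value.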

The output values of this function are based on the output of the adjOutl function; see adjOutl for more details, including the computation of the AO.

To visualize the depth of bivariate data, one can apply the mrainbowplot function, which plots the data colored according to their depth.

References

Hubert M., Van der Veeken S. (2008). Outlier detection for skewed data. Journal of Chemometrics, 22, 235-246.

Hubert M., Rousseeuw P.J., Segaert P. (2015). Multivariate Functional Outlier Detection. Statistical Methods & Applications, 24, 177-202.

See Also

adjOutl, sprojmedian, mrainbowplot, dirOutl, outlyingness

Examples

# NOT RUN {
# Compute the skewness-adjusted projection depth 
# of a simple two-dimensional dataset.
# Outliers are plotted in red.

data(bloodfat)
Result <- sprojdepth(x = bloodfat)
IndOutliers <- which(!Result$flagX)
plot(bloodfat)
# drop = FALSE keeps the two-column structure even if only one outlier is found
points(bloodfat[IndOutliers, , drop = FALSE], col = "red")

# A multivariate rainbowplot may be obtained using mrainbowplot.
plot.options <- list(legend.title = "SPD")
mrainbowplot(x = bloodfat,
             depths = Result$depthX,
             plot.options = plot.options)

# Options for the underlying outlyingness routine may be passed 
# using the options argument. 
Result <- sprojdepth(x = bloodfat,
                     options = list(type = "Affine",
                                    ndir = 1000,
                                    seed = 12345))
# }
