cellWise (version 2.1.1)

estLocScale: Estimate robust location and scale

Description

Computes a robust location estimate and a robust scale estimate for every column of X.

Usage

estLocScale(X, type = "wrap", precScale = 1e-12,
center = TRUE, alpha = 0.5, nLocScale = 25000, silent = FALSE)

Arguments

X

The input data. It must be an \(n\) by \(d\) matrix or a data frame.

type

The type of estimators used. One of:

  • "1stepM": The location is the 1-step M-estimator with the biweight psi function. The scale estimator is the 1-step M-estimator using a Huber rho function with \(b = 2.5\).

  • "mcd": The location is the weighted univariate MCD estimator with cutoff \(\sqrt{qchisq(0.975, 1)}\). The scale is the corresponding weighted univariate MCD estimator, with a correction factor to make it approximately unbiased at Gaussian data.

  • "wrap": Starting from the initial estimates corresponding to option "mcd", the location is the 1-step M-estimator with the wrapping psi function with \(b = 1.5\) and \(c = 4\). The scale estimator is the same as in option "mcd".

Defaults to "wrap".
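As a rough illustration of the wrapping psi function named under option "wrap", the sketch below implements its usual piecewise form in base R: the identity on \([-b, b]\), a smooth tanh-shaped descent between \(b\) and \(c\), and zero beyond \(c\). The function name psi_wrap and the constants q1 and q2 are assumptions that do not appear on this page; the constants are the values commonly quoted for \(b = 1.5\) and \(c = 4\), and the package's internal code may differ.

```r
# Sketch of the wrapping psi function (hypothetical helper, not a cellWise export).
# q1 and q2 are assumed constants chosen so that psi is continuous at |z| = b.
psi_wrap <- function(z, b = 1.5, c = 4, q1 = 1.540793, q2 = 0.8622731) {
  az <- abs(z)
  ifelse(az <= b,
         z,                                         # central part: untouched
         ifelse(az <= c,
                q1 * tanh(q2 * (c - az)) * sign(z), # smooth redescending part
                0))                                 # far outliers: mapped to 0
}

z <- c(-5, -2, 0, 1, 1.5, 3, 4, 10)
psi_wrap(z)
```

Values inside \([-1.5, 1.5]\) pass through unchanged, while values beyond 4 in absolute value contribute nothing to the location estimate.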

precScale

The precision scale used throughout the algorithm. Defaults to \(1e-12\).

center

Whether or not the data has to be centered before calculating the scale. Not in use for type = "mcd". Defaults to TRUE.

alpha

The value of \(\alpha\) in the univariate MCD. It must be between 0.5 and 1. The subset size is \(h = \lceil \alpha n \rceil\). Only used for type = "mcd". Defaults to \(\alpha = 0.5\).
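To make the role of \(\alpha\) and \(h\) concrete, here is a minimal base R sketch of a reweighted univariate MCD. It is not the cellWise implementation: the helper name uni_mcd and both consistency factors are assumptions (the textbook factors for a truncated normal), and the package's exact correction factor may differ.

```r
# Sketch of a reweighted univariate MCD (hypothetical helper, not from cellWise).
uni_mcd <- function(x, alpha = 0.5) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  h <- ceiling(alpha * n)                  # subset size h = ceiling(alpha * n)

  # Raw MCD: the contiguous window of h sorted values with the smallest variance.
  starts <- seq_len(n - h + 1)
  vars <- vapply(starts, function(i) var(x[i:(i + h - 1)]), numeric(1))
  best <- which.min(vars)
  window <- x[best:(best + h - 1)]
  raw_loc <- mean(window)

  # Assumed consistency factor for a normal sample truncated to its central h/n mass.
  cfac <- 1
  if (h < n) {
    q <- qnorm((1 + h / n) / 2)
    cfac <- 1 / sqrt(1 - 2 * q * dnorm(q) / (h / n))
  }
  raw_scale <- cfac * sd(window)

  # Reweighting: keep points within the cutoff sqrt(qchisq(0.975, 1)).
  cutoff <- sqrt(qchisq(0.975, 1))
  keep <- abs(x - raw_loc) / raw_scale <= cutoff
  wfac <- 1 / sqrt(1 - 2 * cutoff * dnorm(cutoff) / 0.975)  # assumed correction
  list(loc = mean(x[keep]), scale = wfac * sd(x[keep]), h = h)
}

set.seed(1)
x <- c(rnorm(95), rep(50, 5))   # 95 clean values plus 5 gross outliers
fit <- uni_mcd(x, alpha = 0.5)
```

With \(n = 100\) and \(\alpha = 0.5\) the subset size is \(h = 50\), so up to half of the observations can be contaminated before the raw estimate breaks down.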

nLocScale

If nLocScale \(< n\), nLocScale observations are sampled to compute the location and scale. This speeds up the computation if \(n\) is very large. When nLocScale \(= 0\) all observations are used. Defaults to nLocScale \(= 25000\).
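The subsampling rule can be sketched in a few lines of base R. The helper name subsample_rows is hypothetical, and the sketch assumes simple random sampling of rows without replacement; this page does not specify how the package draws its sample.

```r
# Sketch of the nLocScale rule (hypothetical helper, not from cellWise).
subsample_rows <- function(X, nLocScale = 25000) {
  n <- nrow(X)
  if (nLocScale > 0 && nLocScale < n) {
    # Assumption: simple random sample of rows, without replacement.
    X[sample.int(n, nLocScale), , drop = FALSE]
  } else {
    X   # nLocScale = 0 or nLocScale >= n: use all observations
  }
}

set.seed(1)
X <- matrix(rnorm(1000 * 4), ncol = 4)
X_small <- subsample_rows(X, nLocScale = 100)  # 100 sampled rows
X_all   <- subsample_rows(X, nLocScale = 0)    # all 1000 rows
```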

silent

Whether or not a warning message should be printed when very small scales are found. Defaults to FALSE.

Value

A list with components:

  • loc A vector with the estimated locations.

  • scale A vector with the estimated scales.
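A common use of the returned list is to robustly standardize each column. The sketch below assumes loc and scale are numeric vectors with one entry per column, as described above. The median and MAD fallback is only a stand-in so that the snippet runs when cellWise is not installed; it is not the estimator documented here.

```r
set.seed(1)
X <- matrix(rnorm(200 * 3, mean = 10, sd = 2), ncol = 3)

fit <- if (requireNamespace("cellWise", quietly = TRUE)) {
  cellWise::estLocScale(X)
} else {
  # Stand-in when cellWise is unavailable: column medians and MADs.
  list(loc = apply(X, 2, median), scale = apply(X, 2, mad))
}

# Subtract each column's location, then divide by its scale.
Z <- sweep(sweep(X, 2, as.numeric(fit$loc), "-"), 2, as.numeric(fit$scale), "/")
```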

References

Raymaekers, J., Rousseeuw, P.J. (2019). Fast robust correlation for high dimensional data. Technometrics, published online.

See Also

wrap

Examples

# NOT RUN {
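library(MASS)      # provides mvrnorm()
library(cellWise)  # provides estLocScale()
set.seed(12345)
n <- 100; d <- 10
# Simulate n observations from a d-variate standard normal distribution
X <- mvrnorm(n, rep(0, d), diag(d))
# Robust location and scale of each column (default type = "wrap")
locScale <- estLocScale(X)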
# }
