robscale (version 0.1.1)

robLoc: Robust M-Estimate of Location

Description

Compute the robust M-estimate of location for very small samples using the logistic \(\psi\) function of Rousseeuw & Verboven (2002).

Usage

robLoc(
  x,
  scale = NULL,
  na.rm = FALSE,
  maxit = 80L,
  tol = sqrt(.Machine$double.eps)
)

Value

A single numeric value: the robust M-estimate of location. Returns NA if x has length zero (after removal of NAs when na.rm = TRUE).

Arguments

x

A numeric vector.

scale

Optional numeric scalar giving a known scale. When supplied, the MAD is replaced by this value and the minimum sample size for iteration is lowered from 4 to 3 (see ‘Details’).

na.rm

Logical. If TRUE, NA values are stripped from x before computation. If FALSE (the default), the presence of any NA raises an error.

maxit

Maximum number of Newton–Raphson iterations. Defaults to 80.

tol

Convergence tolerance. Iteration stops when the absolute Newton step falls below tol. Defaults to sqrt(.Machine$double.eps).

Details

The location estimator \(T_n\) is the solution to the M-estimating equation

$$\frac{1}{n}\sum_{i=1}^{n}\psi_{\mathrm{log}} \!\left(\frac{x_i - T_n}{S_n}\right) = 0$$

where \(S_n\) is a fixed auxiliary scale estimate and \(\psi_{\mathrm{log}}\) is the logistic psi function (Rousseeuw & Verboven 2002, Eq. 23):

$$\psi_{\mathrm{log}}(x) = \frac{e^x - 1}{e^x + 1} = \tanh(x/2)$$

This function is bounded in \((-1, 1)\), smooth (\(C^\infty\)), and strictly monotone. Boundedness provides robustness against outliers; smoothness avoids the corner artifacts of Huber's \(\psi\) at small \(n\).
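The two forms of the logistic psi can be checked numerically with a short sketch. The names psi_log_exp and psi_log_tanh are introduced here for illustration only and are not part of robscale.

```r
# Illustrative helpers (not part of robscale): the two algebraic forms
# of the logistic psi function.
psi_log_exp  <- function(x) (exp(x) - 1) / (exp(x) + 1)
psi_log_tanh <- function(x) tanh(x / 2)

u <- seq(-5, 5, by = 0.5)

# The exponential and tanh forms agree to floating-point accuracy.
all.equal(psi_log_exp(u), psi_log_tanh(u))

# Even for extreme arguments the value never leaves [-1, 1] in floating point.
range(psi_log_tanh(c(-50, 50)))

# Monotonicity on the grid: successive differences are all positive.
all(diff(psi_log_tanh(u)) > 0)
```

The tanh form is probably the safer one to compute with, since exp(x) overflows for large x while tanh saturates cleanly.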

Iteration scheme. The estimating equation is solved by Newton–Raphson iteration. The derivative of the logistic psi satisfies \(\psi'(x) = \tfrac{1}{2}\bigl(1 - \psi^2(x)\bigr)\) (the factor \(\tfrac{1}{2}\) comes from differentiating \(\tanh(x/2)\)), so the Newton step requires no additional transcendental function evaluations beyond those already computed for the numerator. Starting value: \(T^{(0)} = \mathrm{med}(x)\). Auxiliary scale: \(S = \mathrm{MAD}(x)\) unless scale is supplied.
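A minimal sketch of such an iteration is given below. It is not the package's implementation: rob_loc_sketch is a hypothetical name, the scale is assumed to be strictly positive, and stats::mad is used with its default consistency constant, which may differ from what robLoc does internally. Differentiating \(\psi_{\mathrm{log}}(x) = \tanh(x/2)\) gives \(\psi'(x) = (1 - \psi^2(x))/2\), so the Newton update is \(T \leftarrow T + 2S \sum_i \psi_i / \sum_i (1 - \psi_i^2)\).

```r
# Hypothetical helper, not part of robscale: Newton-Raphson for the
# logistic-psi location equation with the auxiliary scale held fixed.
# Assumes scale > 0 and that x contains no NA values.
rob_loc_sketch <- function(x, scale = stats::mad(x), maxit = 80L,
                           tol = sqrt(.Machine$double.eps)) {
  t_cur <- stats::median(x)                  # starting value T^(0) = med(x)
  for (k in seq_len(maxit)) {
    psi  <- tanh((x - t_cur) / scale / 2)    # psi_log at the scaled residuals
    # psi' = (1 - psi^2) / 2, so the derivative reuses psi directly.
    step <- 2 * scale * sum(psi) / sum(1 - psi^2)
    t_cur <- t_cur + step
    if (abs(step) < tol) break               # stop on a small Newton step
  }
  t_cur
}

x   <- c(1, 2, 3, 5, 7, 8)
est <- rob_loc_sketch(x)

# At convergence the estimating equation is satisfied (value close to zero).
mean(tanh((x - est) / stats::mad(x) / 2))
```

The sketch omits the small-sample fallback and NA handling, so its results should not be expected to match robLoc in every case.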

Decoupled estimation. Location and scale are estimated separately: robLoc holds the auxiliary scale fixed at \(\mathrm{MAD}(x)\) (or at the supplied scale) throughout iteration, following the decoupled approach of Rousseeuw & Verboven (2002, Sec. 4.1). This avoids the positive-feedback instability of simultaneous location–scale iteration (Huber's Proposal 2) in small samples.

Fallback. When the sample is too small for reliable iteration the function returns the median directly:

  • \(n < 4\) when scale is unknown (the MAD is unreliable at \(n = 3\));

  • \(n < 3\) when scale is known.
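The fallback rule listed above can be written as a small predicate. The function uses_median_fallback is a hypothetical illustration of the documented thresholds, not a function exported by robscale.

```r
# Hypothetical helper (not part of robscale): does the documented rule
# send a sample of size n to the median fallback?
uses_median_fallback <- function(n, scale_known = FALSE) {
  min_n <- if (scale_known) 3L else 4L   # thresholds as stated in the list above
  n < min_n
}

uses_median_fallback(3)                      # TRUE: scale unknown, n < 4
uses_median_fallback(3, scale_known = TRUE)  # FALSE: known scale, n >= 3
uses_median_fallback(2, scale_known = TRUE)  # TRUE: n < 3 even with known scale
```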

References

Rousseeuw, P. J. and Verboven, S. (2002) Robust estimation in very small samples. Computational Statistics & Data Analysis, 40(4), 741–758. doi:10.1016/S0167-9473(02)00078-6

See Also

median for the starting value; mad for the auxiliary scale; robScale for the companion scale estimator.

Examples

library(robscale)

robLoc(1:9)

x <- c(1, 2, 3, 5, 7, 8)
robLoc(x)

# Known scale lowers the minimum sample size to 3
robLoc(c(1, 2, 3), scale = 1.5)

# Outlier resistance
x_clean <- c(2.0, 3.1, 2.7, 2.9, 3.3)
x_dirty <- replace(x_clean, 5, 100)
c(robLoc(x_clean), robLoc(x_dirty))   # barely moves
c(mean(x_clean), mean(x_dirty))       # destroyed