genefilter (version 1.48.1)

shorth: A location estimator based on the shorth

Description

A location estimator based on the shorth

Usage

shorth(x, na.rm=FALSE, tie.action="mean", tie.limit=0.05)

Arguments

x
Numeric vector.
na.rm
Logical. If TRUE, then non-finite (according to is.finite) values in x are ignored. Otherwise, presence of non-finite or NA values will lead to an error message.
tie.action
Character scalar. See details.
tie.limit
Numeric scalar. See details.

Value

The mean of the x values that lie in the shorth.

Details

The shorth is the shortest interval that covers half of the values in x. This function calculates the mean of the x values that lie in the shorth. This was proposed by Andrews (1972) as a robust estimator of location.
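The definition above can be illustrated with a minimal base R sketch. The function name shorth_sketch is hypothetical, and the window size (ceiling(n / 2) points) and the tie handling (left-most window) are assumptions made for illustration; this is not the genefilter implementation.

```r
# Hypothetical helper for illustration only; not the genefilter code.
shorth_sketch <- function(x) {
  stopifnot(is.numeric(x), length(x) >= 2, all(is.finite(x)))
  sx <- sort(x)
  n <- length(sx)
  k <- ceiling(n / 2)                        # assumed: each window holds half the points
  starts <- seq_len(n - k + 1)               # candidate window start indices
  widths <- sx[starts + k - 1] - sx[starts]  # length of each candidate interval
  i <- which.min(widths)                     # left-most shortest interval
  mean(sx[i:(i + k - 1)])                    # mean of the points inside it
}

shorth_sketch(c(0, 10, 11, 12, 50))
```

Sorting first means that every interval covering k consecutive order statistics can be scored with a single vectorised subtraction.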

Ties: if there are multiple shortest intervals, the action specified in tie.action is applied. Allowed values are mean (the default), max and min. For mean, the average value is considered; however, an error is generated if the start indices of the different shortest intervals differ by more than the fraction tie.limit of length(x). For min and max, the left-most or right-most, respectively, of the multiple shortest intervals is considered.
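The tie rule can be sketched as a small helper that reduces several tied start indices to one. The name resolve_ties_sketch and its arguments starts and n are hypothetical, and rounding the averaged start index is an assumption about how "the average value" becomes a single interval; the package may do this differently.

```r
# Hypothetical helper for illustration only; not the genefilter code.
# starts: start indices (in the sorted data) of all tied shortest intervals
# n:      length(x)
resolve_ties_sketch <- function(starts, n, tie.action = "mean", tie.limit = 0.05) {
  switch(tie.action,
    min = min(starts),    # left-most shortest interval
    max = max(starts),    # right-most shortest interval
    mean = {
      # refuse to average intervals that lie too far apart
      if (max(starts) - min(starts) > tie.limit * n)
        stop("tied shortest intervals are too far apart to average")
      round(mean(starts)) # assumed: rounded average of the start indices
    },
    stop("tie.action must be one of 'mean', 'max', 'min'")
  )
}

resolve_ties_sketch(c(2, 4), n = 100)
```

With tie.limit = 0.05 and n = 100, tied intervals whose start indices are more than 5 positions apart would raise an error under mean, while min and max would still return an index.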

Rate of convergence: as an estimator of location of a unimodal distribution, under regularity conditions, the quantity computed here has an asymptotic rate of only $n^{-1/3}$ and a complicated limiting distribution.

See half.range.mode for a version that refines the estimate iteratively and has a built-in bootstrapping option.

References

  • G Sawitzki, “The Shorth Plot.” Available at http://lshorth.r-forge.r-project.org/TheShorthPlot.pdf

  • DF Andrews, “Robust Estimates of Location.” Princeton University Press (1972).

  • R Gruebel, “The Length of the Shorth.” Annals of Statistics 16, 2:619-628 (1988).

  • DR Bickel and R Fruehwirth, “On a fast, robust estimator of the mode: Comparisons to other robust estimators with applications.” Computational Statistics & Data Analysis 50, 3500-3530 (2006).

See Also

half.range.mode

Examples

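  library(genefilter)  # provides shorth and half.range.mode

  # mixture of a standard normal and a uniform on [0, 10]
  x = c(rnorm(500), runif(500) * 10)
  methods = c("mean", "median", "shorth", "half.range.mode")
  # apply each location estimator, looked up by name, to x
  ests = sapply(methods, function(m) get(m)(x))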

  if(interactive()) {
    colors = 1:4
    hist(x, 40, col="orange")
    abline(v=ests, col=colors, lwd=3, lty=1:2)
    legend(5, 100, names(ests), col=colors, lwd=3, lty=1:2) 
  }
