genefilter (version 1.54.2)

shorth: A location estimator based on the shorth

Description

A location estimator based on the shorth

Usage

shorth(x, na.rm=FALSE, tie.action="mean", tie.limit=0.05)

Arguments

x
Numeric vector.
na.rm
Logical. If TRUE, then non-finite (according to is.finite) values in x are ignored. Otherwise, presence of non-finite or NA values will lead to an error message.
tie.action
Character scalar. See details.
tie.limit
Numeric scalar. See details.

Value

  • The mean of the x values that lie in the shorth.

Details

The shorth is the shortest interval that covers half of the values in x. This function calculates the mean of the x values that lie in the shorth. This was proposed by Andrews (1972) as a robust estimator of location.
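
The definition above can be sketched in base R. This is a minimal illustration, not the genefilter implementation: the helper name shorth_sketch is hypothetical, "half of the values" is assumed to mean ceiling(n / 2) points, and ties are resolved by simply taking the left-most shortest interval.

```r
# Hypothetical helper, not part of genefilter: mean of the points that lie
# in the shortest interval covering half of the sample.
shorth_sketch <- function(x) {
  stopifnot(is.numeric(x), length(x) >= 2, all(is.finite(x)))
  sx <- sort(x)
  n <- length(sx)
  k <- ceiling(n / 2)                        # assumed meaning of "half of the values"
  starts <- seq_len(n - k + 1)               # candidate start indices in the sorted data
  widths <- sx[starts + k - 1] - sx[starts]  # length of each candidate interval
  i <- which.min(widths)                     # left-most shortest interval
  mean(sx[i:(i + k - 1)])                    # mean of the points inside it
}

# The tight cluster 1, 1.1, 1.2 forms the shortest interval holding 3 of 5 points.
est <- shorth_sketch(c(0, 1, 1.1, 1.2, 10))
print(est)
```

Sorting first means every candidate interval is a run of k consecutive order statistics, so only n - k + 1 interval lengths need to be compared.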

Ties: if there are multiple shortest intervals, the action specified in tie.action is applied. Allowed values are mean (the default), max and min. For mean, the average value is considered; however, an error is generated if the start indices of the different shortest intervals differ by more than the fraction tie.limit of length(x). For min and max, the left-most or right-most, respectively, of the multiple shortest intervals is considered.
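
The tie rule can be illustrated with a small base R sketch. The helper names shortest_starts and resolve_tie are hypothetical, the half-sample size ceiling(n / 2) is an assumption, and reading "the average value" as the average of the tied start indices is one interpretation of the text above rather than a statement about the package internals.

```r
# Hypothetical helper: start indices (in the sorted data) of all shortest
# intervals that contain ceiling(n / 2) points.
shortest_starts <- function(x) {
  sx <- sort(x)
  n <- length(sx)
  k <- ceiling(n / 2)
  starts <- seq_len(n - k + 1)
  widths <- sx[starts + k - 1] - sx[starts]
  starts[widths == min(widths)]
}

# Hypothetical helper: pick one start index according to the tie rule.
resolve_tie <- function(q, n, tie.action = c("mean", "min", "max"), tie.limit = 0.05) {
  tie.action <- match.arg(tie.action)
  switch(tie.action,
    min = min(q),    # left-most shortest interval
    max = max(q),    # right-most shortest interval
    mean = {
      # Refuse to average when the tied intervals are too far apart.
      if (max(q) - min(q) > tie.limit * n)
        stop("shortest intervals are too far apart to average")
      mean(q)        # assumed reading: average of the tied start indices
    })
}

x <- 1:6                    # equally spaced, so every candidate interval ties
q <- shortest_starts(x)
left  <- resolve_tie(q, length(x), "min")
right <- resolve_tie(q, length(x), "max")
far   <- tryCatch(resolve_tie(q, length(x), "mean"), error = function(e) "error")
```

With equally spaced data the tied start indices span most of the sample, so the mean action fails at the default tie.limit; this is the situation the limit is meant to flag.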

Rate of convergence: as an estimator of location of a unimodal distribution, under regularity conditions, the quantity computed here has an asymptotic rate of only $n^{-1/3}$ and a complicated limiting distribution.
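
The slow rate can be explored informally by simulation. The sketch below is an assumption-laden illustration, not a proof: it uses a hypothetical base R helper (shorth_sketch, with a half-sample of ceiling(n / 2) points and left-most tie handling) instead of the package function, standard normal data, and only three sample sizes, so the fitted slope is a rough indication at best.

```r
# Hypothetical helper, not part of genefilter: mean of the points in the
# shortest interval that contains ceiling(n / 2) of the sorted values.
shorth_sketch <- function(x) {
  sx <- sort(x)
  n <- length(sx)
  k <- ceiling(n / 2)
  starts <- seq_len(n - k + 1)
  i <- which.min(sx[starts + k - 1] - sx[starts])
  mean(sx[i:(i + k - 1)])
}

set.seed(1)
sizes <- c(100, 1000, 10000)

# Standard deviation of the estimate over 200 simulated samples per size.
spread <- sapply(sizes, function(n) sd(replicate(200, shorth_sketch(rnorm(n)))))

# Slope of log(spread) against log(n); compare it informally with -1/3
# (a mean or median would be expected to give a slope near -1/2).
slope <- unname(coef(lm(log(spread) ~ log(sizes)))[2])

print(data.frame(n = sizes, sd = spread))
print(slope)
```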

See half.range.mode for an iterative version that successively refines the estimate and has a built-in bootstrapping option.

References

  • G Sawitzki, The Shorth Plot. Available at http://lshorth.r-forge.r-project.org/TheShorthPlot.pdf
  • DF Andrews, Robust Estimates of Location. Princeton University Press (1972).
  • R Gruebel, The Length of the Shorth. Annals of Statistics 16, 2:619-628 (1988).
  • DR Bickel and R Fruehwirth, On a fast, robust estimator of the mode: Comparisons to other robust estimators with applications. Computational Statistics & Data Analysis 50, 3500-3530 (2006).

See Also

half.range.mode

Examples

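library(genefilter)  # provides shorth() and half.range.mode()

# Mixture sample: a standard normal peak plus a uniform background on [0, 10]
x = c(rnorm(500), runif(500) * 10)

# Apply each location estimator, looked up by name, to the same sample
methods = c("mean", "median", "shorth", "half.range.mode")
ests = sapply(methods, function(m) get(m)(x))

if(interactive()) {
  colors = 1:4
  hist(x, 40, col="orange")
  # Mark each estimate on the histogram
  abline(v=ests, col=colors, lwd=3, lty=1:2)
  legend(5, 100, names(ests), col=colors, lwd=3, lty=1:2)
}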
