Learn R Programming

robslopes (version 1.1.3)

RepeatedMedian: Siegel's repeated median slope and intercept estimator.

Description

Computes the repeated median slope proposed by Siegel (1982) using the algorithm by Matousek et. al (1998). The algorithm runs in an expected \(O(n (log n)^2)\) time, which is typically significantly faster than the \(O(n^2)\) computational cost of the naive algorithm, and requires \(O(n)\) storage.

Usage

RepeatedMedian(x, y, alpha = NULL, beta = NULL, verbose = TRUE)

Value

A list with elements:

intecept

The estimate of the intercept.

slope

The Theil-Sen estimate of the slope.

Arguments

x

A vector of predictor values.

y

A vector of response values.

alpha

Determines the outer order statistic, which is equal to \([alpha*n]\), where \(n\) denotes the sample size. Defaults to NULL, which corresponds with the (upper) median.

beta

Determines the inner order statistic, which is equal to \([beta*(n-1)]\), where \(n\) denotes the sample size. Defaults to NULL, which corresponds with the (upper) median.

verbose

Whether or not to print out the progress of the algorithm. Defaults to TRUE.

Author

Jakob Raymaekers

Details

Given two input vectors x and y of length \(n\), the repeated median is computed as \(med_i med_j (y_i - y_j)/(x_i-x_j)\). The default "outer'' median is the \(\lfloor (n + 2) / 2 \rfloor\) largest element in the ordered median slopes. The inner median, which for each observation is calculated as the median of the slopes connected to this observation, is the \(\lfloor (n +1) / 2 \rfloor\) largest element in the ordered slopes. By changing alpha and beta, other repeated order statistics of the slopes can be computed.

References

Siegel, A. F. (1982). Robust regression using repeated medians. Biometrika, 69(1), 242-244.

Matousek, J., Mount, D. M., & Netanyahu, N. S. (1998). Efficient randomized algorithms for the repeated median line estimator. Algorithmica, 20(2), 136-150.

Raymaekers (2023). "The R Journal: robslopes: Efficient Computation of the (Repeated) Median Slope", The R Journal. (link to open access pdf)

See Also

TheilSen

Examples

Run this code

# We compare the implemented algorithm against a naive brute-force approach.

bruteForceRM <- function(x, y) {
  
  n <- length(x)
  medind1 <- floor((n+2) / 2)
  medind2 <- floor((n+1) / 2)
  temp <-  t(sapply(1:n, function(z)  sort(apply(cbind(x, y), 1 ,
                                                  function(k) (k[2] - y[z]) /
                                                    (k[1] - x[z])))))
  RMslope <- sort(temp[, medind2])[medind1]
  RMintercept <- sort(y - x * RMslope)[medind1]
  return(list(intercept = RMintercept, slope = RMslope))
}

n = 100
set.seed(2)
x = rnorm(n)
y = x + rnorm(n)

t0 <- proc.time()
RM.fast <- RepeatedMedian(x, y, NULL, NULL, FALSE)
t1 <- proc.time()
t1 - t0

t0 <- proc.time()
RM.naive <- bruteForceRM(x, y)
t1 <- proc.time()
t1 - t0

RM.fast$slope - RM.naive$slope

Run the code above in your browser using DataLab