BLlo: Barnett and Lewis Test Adjusted for Low Outliers

Description

Computes the Barnett and Lewis (1995, p. 224; $T_{\mathrm{N}3}$) so-labeled “N3 method” statistic with the TAC adjustment to look for low outliers. The essence of the method, given the order statistics $x_{[1:n]} \le x_{[2:n]} \le \cdots \le x_{[(n-1):n]} \le x_{[n:n]}$, is the statistic $$BL_r = T_{\mathrm{N}3} = \frac{ \sum_{i=1}^r x_{[i:n]} - r \times \mathrm{mean}\{x_{[1:n]}\} } {\sqrt{\mathrm{var}\{x_{[1:n]}\}}}\mbox{,}$$ where the mean and variance are those of all $n$ observations (inside the braces, $x_{[1:n]}$ denotes the whole sample, not the smallest order statistic).

Barnett and Lewis (1995, p. 218) brand this statistic as a test of the “$k \ge 2$ upper outliers” (their $k$ is the $r$ used here), but for the MGBT package “lower” applies in the TAC reformulation. Barnett and Lewis (1995, p. 218) show an example of a modification for two low outliers as $(2\overline{x} - x_{[2:n]} - x_{[1:n]})/s$ for the sample mean $\overline{x}$ and standard deviation $s$. The TAC reformulation thus differs by a sign.

The $BL_r$ is a sum of internally studentized deviations from the mean, and its significance probability is bounded by $$SP(t) \le {n \choose r} P\biggl(\bm{t}(n-2) > \biggl[\frac{n(n-2)t^2}{r(n-r)(n-1)-nt^2}\biggr]^{1/2}\biggr)\mbox{,}$$ where $\bm{t}(df)$ is the t-distribution for $df$ degrees of freedom. The bound holds with equality when $$t \ge \sqrt{r^2(n-1)(n-r-1)/(nr+n)}\mbox{,}$$ where $SP(t)$ is the probability that $T_{\mathrm{N}3} > t$. For reference, the Barnett and Lewis (1995, p. 491) example tables of critical values for $n=10$ and $k \in \{2,3,4\}$ at the 5-percent significance level give $3.18$, $3.82$, and $4.17$, respectively. One of these is evaluated in the Examples.
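The statistic and its sign relation to the Barnett and Lewis two-low-outlier form can be illustrated with a minimal sketch. The helper name `bl_stat` is hypothetical and is not the MGBT implementation; it simply evaluates the displayed formula, assuming the variance is the usual sample variance with an $n-1$ denominator as returned by `var()`.

```r
# Hypothetical helper (not MGBT::BLlo): evaluates the displayed BL_r formula.
# Assumption: var() with its n-1 denominator is the intended variance.
bl_stat <- function(x, r) {
  xs <- sort(x)                       # order statistics x[1:n] <= ... <= x[n:n]
  (sum(xs[seq_len(r)]) - r * mean(x)) / sqrt(var(x))
}

x   <- c(1, 2, 3, 4, 10)              # small made-up sample for illustration
bl2 <- bl_stat(x, r = 2)              # sum of the two lowest, studentized

# Barnett and Lewis form for two low outliers: (2*xbar - x[2:n] - x[1:n]) / s
xs <- sort(x)
bl_two_low <- (2 * mean(x) - xs[2] - xs[1]) / sd(x)

# Same magnitude, opposite sign
same_up_to_sign <- isTRUE(all.equal(bl2, -bl_two_low))
```

For low outliers the sum of the $r$ smallest values lies below $r$ times the mean, so the value from this sketch is negative, which is why a sign change is needed before comparing against tabulated critical values.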

Usage

BLlo(x, r, n=length(x))

Arguments

x: The data values (note that base-10 logarithms of these are not computed internally);

r: The number of truncated observations; and

n: The number of observations.

Value

The value for $BL_r$.

References

Barnett, Vic, and Lewis, Toby, 1995, Outliers in statistical data: Chichester, John Wiley and Sons, ISBN 0-471-93094-6.

Cohn, T.A., 2013-2016, Personal communication of original R source code: U.S. Geological Survey, Reston, Va.

Examples

# NOT RUN {
# See Examples under RSlo()

# }
# NOT RUN {
 # WHA experiments with BL_r()
n <- 10; r <- 3; nsim <- 10000; alpha <- 0.05; Tcrit <- 3.82
BLs <- Ho <- RHS <- SPt <- rep(NA, nsim)
EQ <- sqrt(r^2*(n-1)*(n-r-1)/(n*r+n)) # bound on SP(t) is exact for BL_r >= EQ
for(i in 1:nsim) { # some simulation results shown below
   BLs[i] <- abs(BLlo(rnorm(n), r)) # abs() correcting TAC sign convention
   tq <- sqrt( (n*(n-2)*BLs[i]^2) / (r*(n-r)*(n-1)-n*BLs[i]^2) ) # t-distribution argument
   RHS[i] <- choose(n,r)*pt(tq, n-2, lower.tail=FALSE)
   SPt[i] <- if(BLs[i] >= EQ) RHS[i] else 1 # set SP(t) to unity below EQ
   Ho[i]  <- BLs[i] > Tcrit
}
results <- c(quantile(BLs, probs=1-alpha), sum(Ho)/nsim, sum(SPt < alpha)/nsim)
names(results) <- c("Critical_value", "Ho_rejected", "Coverage_SP(t)")
print(results) # minor differences are because of random number seeding
# Critical_value    Ho_rejected Coverage_SP(t)
#      3.817236       0.048200       0.050100 
# }
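The bound on $SP(t)$ evaluated inside the simulation loop can also be sketched as a standalone function, which makes it easy to check a single tabulated critical value without simulation. The name `sp_bound` and its return structure are hypothetical and are not part of MGBT; the function only restates the formulas from the Description.

```r
# Hypothetical helper (not part of MGBT): upper bound on SP(t) for |BL_r| = t.
# The bound is an equality only when t >= eq_threshold.
sp_bound <- function(t, n, r) {
  denom <- r * (n - r) * (n - 1) - n * t^2
  stopifnot(denom > 0)                # t must lie below its attainable maximum
  tq <- sqrt(n * (n - 2) * t^2 / denom)
  list(bound        = choose(n, r) * pt(tq, df = n - 2, lower.tail = FALSE),
       eq_threshold = sqrt(r^2 * (n - 1) * (n - r - 1) / (n * r + n)))
}

# Tabulated 5-percent critical value for n = 10 and three outliers
res <- sp_bound(3.82, n = 10, r = 3)
res$bound          # expected to be of the order of alpha = 0.05
res$eq_threshold   # 3.82 exceeds this, so the bound is an equality here
```

Returning the threshold alongside the bound keeps the equality condition visible to the caller, rather than silently substituting unity as the simulation does.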