dixonTest (version 1.0.0)

dixonTest: Dixon's Outlier Test (Q-Test)

Description

Performs Dixon's single outlier test.

Usage

dixonTest(x, alternative = c("two.sided", "greater", "less"),
  refined = FALSE)

Arguments

x

a numeric vector of data

alternative

the alternative hypothesis; one of "two.sided", "greater" or "less". Defaults to "two.sided".

refined

logical indicator; if TRUE, the refined version of the test is performed, otherwise the Q-test. Defaults to FALSE.

Details

Let \(X\) denote an independently and identically distributed normal variate. Further, let the realizations, ordered increasingly, be denoted by \(x_1 \le x_2 \le \ldots \le x_n\). Dixon (1950) proposed the following ratio statistic to detect an outlier (two-sided):

$$ r_{j,i-1} = \max\left\{\frac{x_n - x_{n-j}}{x_n - x_i}, \frac{x_{1+j} - x_1}{x_{n-i+1} - x_1}\right\}$$

The null hypothesis that there is no outlier is tested against the alternative that at least one observation is an outlier (two-sided). The first subscript of \(r\), \(j\), indicates the number of outliers suspected at the end of the data set being tested, and the second subscript, \(i-1\), indicates the number of outliers suspected at the opposite end. The statistic \(r_{10}\) is also commonly denoted \(Q\).

The statistic for a single maximum outlier is: $$ r_{j,i-1} = \left(x_n - x_{n-j} \right) / \left(x_n - x_i\right)$$

The null hypothesis is tested against the alternative that the maximum observation is an outlier.

For testing a single minimum outlier, the test statistic is: $$ r_{j,i-1} = \left(x_{1+j} - x_1 \right) / \left(x_{n-i+1} - x_1 \right)$$

The null hypothesis is tested against the alternative that the minimum observation is an outlier.
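
As a worked illustration of the two-sided statistic, consider the six ordered observations 40.02, 40.12, 40.16, 40.18, 40.18, 40.20 (the Dean and Dixon data used in the Examples). The sketch below assumes the usual Q-test form of \(r_{10}\), in which both ratios are divided by the full range \(x_n - x_1\):

$$ \frac{x_n - x_{n-1}}{x_n - x_1} = \frac{40.20 - 40.18}{40.20 - 40.02} = \frac{0.02}{0.18} \approx 0.111, \qquad \frac{x_2 - x_1}{x_n - x_1} = \frac{40.12 - 40.02}{40.20 - 40.02} = \frac{0.10}{0.18} \approx 0.556 $$

Hence \(r_{10} = Q = \max\{0.111, 0.556\} \approx 0.556\), and the observation under suspicion is the minimum, 40.02.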

Apart from the earlier Dixon's Q-test (i.e. \(r_{10}\)), a refined version that was later proposed by Dixon can be performed with this function. In the refined version, the statistic \(r_{j,i-1}\) depends on the sample size as follows:

\(r_{10}\): \(3 \le n \le 7\)
\(r_{11}\): \(8 \le n \le 10\)
\(r_{21}\): \(11 \le n \le 13\)
\(r_{22}\): \(14 \le n \le 30\)
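
The selection rule above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the helpers `dixon_subscripts` and `dixon_ratio_upper` are hypothetical names and not part of the package, and the ratio shown is the one for a suspected maximum, \(\left(x_n - x_{n-j}\right) / \left(x_n - x_i\right)\), with the second subscript stored as `k` so that \(i = k + 1\).

```r
# Hypothetical helper (not part of the package): choose the subscripts
# (j, k) of r_{j,k} from the sample size n, following the ranges listed above.
dixon_subscripts <- function(n) {
  stopifnot(n >= 3, n <= 30)
  if (n <= 7) {
    c(j = 1, k = 0)        # r_10
  } else if (n <= 10) {
    c(j = 1, k = 1)        # r_11
  } else if (n <= 13) {
    c(j = 2, k = 1)        # r_21
  } else {
    c(j = 2, k = 2)        # r_22
  }
}

# Hypothetical helper: ratio for a suspected maximum,
# (x_n - x_{n-j}) / (x_n - x_i) with i = k + 1.
dixon_ratio_upper <- function(x, j, k) {
  x <- sort(x)
  n <- length(x)
  (x[n] - x[n - j]) / (x[n] - x[k + 1])
}

x <- c(568, 570, 570, 570, 572, 578, 584, 596)  # n = 8, so r_11 applies
s <- dixon_subscripts(length(x))
dixon_ratio_upper(x, s[["j"]], s[["k"]])         # (596 - 584) / (596 - 570)
```

The sketch only reproduces the ratio itself; it does not compute a p-value.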

The p-value is computed with the function pdixon.

References

Dixon, W. J. (1950) Analysis of extreme values. Ann. Math. Stat. 21, 488–506. http://dx.doi.org/10.1214/aoms/1177729747.

Dean, R. B., Dixon, W. J. (1951) Simplified statistics for small numbers of observations. Anal. Chem. 23, 636–638. http://dx.doi.org/10.1021/ac60052a025.

McBane, G. C. (2006) Programs to compute distribution functions and critical values for extreme value ratios for outlier detection. J. Stat. Soft. 16. http://dx.doi.org/10.18637/jss.v016.i03.

Examples

# NOT RUN {
## example from Dean and Dixon 1951, Anal. Chem., 23, 636-638.
x <- c(40.02, 40.12, 40.16, 40.18, 40.18, 40.20)
dixonTest(x, alternative = "two.sided")

## example from the dataplot manual of NIST
x <- c(568, 570, 570, 570, 572, 578, 584, 596)
dixonTest(x, alternative = "greater", refined = TRUE)

# }
