dixonTest: Dixon's Outlier Test (Q-Test)

Description

Performs Dixon's single outlier test.

Usage

dixonTest(x, alternative = c("two.sided", "greater", "less"),
  refined = FALSE)

Arguments

x

a numeric vector of data

alternative

the alternative hypothesis, one of "two.sided", "greater" or "less". Defaults to "two.sided".

refined

logical indicator of whether the refined version (TRUE) or the Q-test (FALSE) is performed. Defaults to FALSE.

Details

Let $X$ denote an independent and identically distributed normal variate. Further, let the realizations, sorted in increasing order, be denoted by $x_1 \le x_2 \le \ldots \le x_n$. Dixon (1950) proposed the following ratio statistic to detect an outlier (two-sided):

$$ r_{j,i-1} = \max\left\{\frac{x_n - x_{n-j}}{x_n - x_i}, \frac{x_{1+j} - x_1}{x_{n-i+1} - x_1}\right\}$$

The null hypothesis (no outlier) is tested against the alternative that at least one observation is an outlier (two-sided). The subscript $j$ on the $r$ symbol indicates the number of outliers that are suspected at the upper end of the data set, and the subscript $i$ indicates the number of outliers suspected at the lower end. For $r_{10}$ it is also common to use the symbol $Q$.

The statistic for a single maximum outlier is:

$$ r_{j,i-1} = \left(x_n - x_{n-j} \right) / \left(x_n - x_i\right)$$

The null hypothesis is tested against the alternative, the maximum observation is an outlier.

For testing a single minimum outlier, the test statistic is:

$$ r_{j,i-1} = \left(x_{1+j} - x_1 \right) / \left(x_{n-i+1} - x_1 \right)$$

The null hypothesis is tested against the alternative, the minimum observation is an outlier.
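
As a rough illustration of the ratios above, the following sketch computes $r_{10}$ (the Q statistic) by hand in base R. For $r_{10}$ both denominators reduce to the sample range $x_n - x_1$. The helper name `r10` is hypothetical and is not part of the package. It only reproduces the ratio, not the p-value, and it uses the Dean and Dixon data from the Examples section.

```r
# Hypothetical helper (not part of the package): Dixon's r10 ratios by hand.
r10 <- function(x) {
  x <- sort(x)                      # increasingly ordered realizations
  n <- length(x)
  rng <- x[n] - x[1]                # sample range, the r10 denominator
  stopifnot(n >= 3, rng > 0)
  c(upper = (x[n] - x[n - 1]) / rng,   # gap at the maximum over the range
    lower = (x[2] - x[1]) / rng)       # gap at the minimum over the range
}

x <- c(40.02, 40.12, 40.16, 40.18, 40.18, 40.20)
q <- r10(x)
q        # upper is about 0.111, lower is about 0.556
max(q)   # two-sided statistic: the larger of the two ratios
```

Here the lower ratio dominates, so the two-sided statistic points at the minimum, 40.02, as the suspect observation.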

Apart from the earlier Dixon's Q-test (i.e. $r_{10}$), a refined version that was later proposed by Dixon can be performed with this function, where the statistic $r_{j,i-1}$ depends on the sample size as follows:

$r_{10}$:	$3 \le n \le 7$
$r_{11}$:	$8 \le n \le 10$
$r_{21}$:	$11 \le n \le 13$
$r_{22}$:	$14 \le n \le 30$
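
The selection rule in the table above can be sketched in base R as follows. The helpers `dixon_subscripts` and `dixon_upper` are hypothetical names introduced for illustration only. They are not the package's internals, and only the statistic for a single maximum outlier is computed, using the NIST data from the Examples section.

```r
# Hypothetical helper: map the sample size n to the subscripts (j, i - 1)
# of the refined statistic, following the table above.
dixon_subscripts <- function(n) {
  stopifnot(n >= 3, n <= 30)
  if (n <= 7) {
    c(j = 1, k = 0)   # r10
  } else if (n <= 10) {
    c(j = 1, k = 1)   # r11
  } else if (n <= 13) {
    c(j = 2, k = 1)   # r21
  } else {
    c(j = 2, k = 2)   # r22
  }
}

# Hypothetical helper: ratio for a single maximum outlier,
# (x_n - x_{n-j}) / (x_n - x_i), where k stands for i - 1.
dixon_upper <- function(x, refined = TRUE) {
  x <- sort(x)
  n <- length(x)
  s <- if (refined) dixon_subscripts(n) else c(j = 1, k = 0)
  j <- s[["j"]]
  i <- s[["k"]] + 1
  (x[n] - x[n - j]) / (x[n] - x[i])
}

x <- c(568, 570, 570, 570, 572, 578, 584, 596)
dixon_upper(x)                   # n = 8 selects r11: (596 - 584) / (596 - 570)
dixon_upper(x, refined = FALSE)  # r10: (596 - 584) / (596 - 568)
```

With `refined = TRUE` the denominator skips the smallest observation, so a second extreme value at the opposite end has less influence on the ratio.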

The p-value is computed with the function pdixon.

References

Dixon, W. J. (1950) Analysis of extreme values. Ann. Math. Stat. 21, 488--506. http://dx.doi.org/10.1214/aoms/1177729747.

Dean, R. B., Dixon, W. J. (1951) Simplified statistics for small numbers of observations. Anal. Chem. 23, 636--638. http://dx.doi.org/10.1021/ac60052a025.

McBane, G. C. (2006) Programs to compute distribution functions and critical values for extreme value ratios for outlier detection. J. Stat. Soft. 16. http://dx.doi.org/10.18637/jss.v016.i03.

Examples

# NOT RUN {
## example from Dean and Dixon 1951, Anal. Chem., 23, 636-638.
x <- c(40.02, 40.12, 40.16, 40.18, 40.18, 40.20)
dixonTest(x, alternative = "two.sided")

## example from the dataplot manual of NIST
x <- c(568, 570, 570, 570, 572, 578, 584, 596)
dixonTest(x, alternative = "greater", refined = TRUE)

# }
