When method="resistant", the outlying observations are those outside the interval:
$$[Q_1 - k \times IQR;\quad Q_3 + k \times IQR] $$
where \(Q_1\) and \(Q_3\) are, respectively, the 1st and 3rd quartiles of x, while \(IQR = Q_3 - Q_1\) is the interquartile range. The value \(k=1.5\) (the so-called 'inner fences') is commonly used when drawing a boxplot. Values \(k=2\) and \(k=3\) give the middle and outer fences, respectively.
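The interval above can be sketched in a few lines of base R. This is only an illustration: `resistant_fences` is a hypothetical helper, not a function of the package, and it relies on the default quartile definition of `quantile()`, which may differ from the one used internally.

```r
# Sketch: fences [Q1 - k*IQR, Q3 + k*IQR] with base R quartiles.
# `resistant_fences` is a hypothetical name used for illustration only.
resistant_fences <- function(x, k = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  c(lower = q[1] - k * iqr, upper = q[2] + k * iqr)
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)
f <- resistant_fences(x)          # inner fences (k = 1.5)
outliers <- x[x < f[["lower"]] | x > f[["upper"]]]
```

With these data the quartiles are 3.25 and 7.75, so the inner fences are -3.5 and 14.5 and only the value 50 falls outside them.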
When method="asymmetric", the outlying observations are those outside the interval:
$$[Q_1 - 2k \times (Q_2-Q_1);\quad Q_3 + 2k \times (Q_3-Q_2)] $$
where \(Q_2\) is the median; this modification accounts for slight skewness of the distribution.
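A minimal base R sketch of the asymmetric interval follows. The helper name `asymmetric_fences` is hypothetical, and the quartiles come from the default `quantile()` definition, which is an assumption rather than the package's documented choice.

```r
# Sketch: fences [Q1 - 2k*(Q2 - Q1), Q3 + 2k*(Q3 - Q2)].
# `asymmetric_fences` is a hypothetical name used for illustration only.
asymmetric_fences <- function(x, k = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  c(lower = q[1] - 2 * k * (q[2] - q[1]),
    upper = q[3] + 2 * k * (q[3] - q[2]))
}

# Right-skewed toy data: the upper half of the box is wider than the lower half
x <- c(1, 2, 2, 3, 3, 4, 6, 9, 15, 40)
f <- asymmetric_fences(x)
outliers <- x[x < f[["lower"]] | x > f[["upper"]]]
```

Here the fences are -1.5 and 22.5, whereas the symmetric interval with the same k would be -6.75 and 17.25: the bound is pushed outwards on the side of the longer tail. When \(Q_2 - Q_1 = Q_3 - Q_2\) the two intervals coincide.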
Finally, when method="adjbox", the outlying observations are identified using the method proposed by Hubert and Vandervieren (2008), which is based on the medcouple measure of skewness; in practice the bounds are:
$$[Q_1-1.5 \times e^{aM} \times IQR;\quad Q_3+1.5 \times e^{bM}\times IQR ]$$
where \(M\) is the medcouple. When \(M > 0\) (positive skewness), \(a = -4\) and \(b = 3\); conversely, \(a = -3\) and \(b = 4\) for negative skewness (\(M < 0\)). According to Hubert and Vandervieren (2008), this adjustment of the boxplot works with moderate skewness (\(-0.6 \leq M \leq 0.6\)). The bounds of the adjusted boxplot are derived by applying the function adjboxStats in the package robustbase.
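To make the role of \(a\), \(b\) and \(M\) concrete, the formula can be written as a small base R function. This is a sketch only: `adjbox_fences` is a hypothetical helper that takes the quartiles and the medcouple as inputs, it does not call adjboxStats, and its results need not match that function exactly.

```r
# Sketch: fences [Q1 - coef*exp(a*M)*IQR, Q3 + coef*exp(b*M)*IQR].
# `adjbox_fences` is a hypothetical name; M must be supplied by the caller
# (for example from robustbase::mc(x), if that package is installed).
adjbox_fences <- function(q1, q3, M, coef = 1.5) {
  # At M = 0 both choices of (a, b) give the same bounds,
  # so treating M = 0 as "non-negative" is harmless.
  if (M >= 0) {
    a <- -4; b <- 3
  } else {
    a <- -3; b <- 4
  }
  iqr <- q3 - q1
  c(lower = q1 - coef * exp(a * M) * iqr,
    upper = q3 + coef * exp(b * M) * iqr)
}

f0   <- adjbox_fences(q1 = 2, q3 = 6, M = 0)     # no skewness
fpos <- adjbox_fences(q1 = 2, q3 = 6, M = 0.3)   # positive skewness
```

With \(M = 0\) both exponential factors equal 1 and the bounds reduce to the standard boxplot fences (here -4 and 12). With \(M > 0\) the lower fence moves closer to \(Q_1\) and the upper fence moves further from \(Q_3\).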
When weights are available (passed via the argument weights), they are used in the computation of the quartiles. In particular, the quartiles are derived using the function wtd.quantile in the package Hmisc.
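The effect of the weights can be illustrated with a deliberately simple weighted quantile. This is not the algorithm used by wtd.quantile: `weighted_quantile` is a hypothetical helper that inverts the weighted empirical distribution function without interpolation, so its values may differ from those returned by the package.

```r
# Sketch: weighted quantile as the smallest x whose cumulative
# weight share reaches p (no interpolation).
# `weighted_quantile` is a hypothetical name used for illustration only.
weighted_quantile <- function(x, w, probs) {
  ord <- order(x)
  x <- x[ord]
  w <- w[ord]
  cum_w <- cumsum(w) / sum(w)   # weighted empirical distribution function
  sapply(probs, function(p) x[which(cum_w >= p)[1]])
}

x <- c(10, 20, 30, 40)
w <- c(1, 1, 1, 5)              # the last unit represents five units
q <- weighted_quantile(x, w, c(0.25, 0.5, 0.75))
```

In this toy example the heavily weighted value 40 pulls the median and the 3rd quartile up to 40, while the 1st quartile is 20.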
Note that when a log transformation is requested (argument logt=TRUE), all the estimates (quartiles, etc.) refer to \(\log(x+1)\).
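As a rough base R illustration of what working on the \(\log(x+1)\) scale implies, the sketch below computes the standard fences with \(k = 1.5\) on the transformed values. The variable names are hypothetical, and the quartiles come from the default `quantile()` definition rather than from the package.

```r
# Sketch: quartiles and fences computed on z = log(x + 1), not on x.
x <- c(0, 1, 3, 7, 15, 1000)
z <- log(x + 1)                       # zeros are allowed: log(0 + 1) = 0

q <- quantile(z, probs = c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr             # bounds are on the log(x + 1) scale
upper <- q[2] + 1.5 * iqr

outliers <- x[z < lower | z > upper]  # observations flagged via z
```

The bounds `lower` and `upper` are expressed on the transformed scale and must be compared with `z`, not with the original values. In this example only the value 1000 is flagged.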