FUNOP stands for FUll NOrmal Plot.
The procedure identifies outliers by calculating each value's slope (z) relative to the vector's median.
The procedure ignores values in the middle third of the ordered vector. The remaining values are all candidates for consideration. The slopes of all candidates are calculated, and the median of their slopes is used as the primary basis for identifying outliers.
Any value whose slope is at least B times the median slope is identified as an outlier. Additionally, any value whose magnitude is larger than that of the slope-based outliers is also identified as an outlier.
However, the procedure will not identify as outliers any values within A standard deviations of the vector's median (i.e., the median of x itself, not the median of the candidate slopes).
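The steps above can be sketched in base R. This is an illustrative reading of the description, not the package's source: the name funop_sketch is hypothetical, the normal score is assumed to follow Tukey's approximation qnorm((3i - 1) / (3n + 1)), the middle third is assumed to be the positions with n/3 < i <= 2n/3, the median candidate slope is assumed to serve as the standard deviation estimate for A, and "larger magnitude" is interpreted as more extreme on the same side of the median.

```r
# Hypothetical sketch of the procedure described above (not the package code).
funop_sketch <- function(x, A = 0, B = 1.5) {
  n <- length(x)
  med <- median(x)
  i <- rank(x, ties.method = "first")       # ordinal position in sorted x
  a <- qnorm((3 * i - 1) / (3 * n + 1))     # assumed normal score for position i
  middle <- i > n / 3 & i <= 2 * n / 3      # assumed middle-third definition
  z <- (x - med) / a                        # slope relative to the median
  z_split <- median(z[!middle])             # median slope of the candidates

  # Slope rule: candidates at least B times the median slope, and at least
  # A * z_split away from the median (z_split as the assumed sd estimate).
  flagged <- !middle & z >= B * z_split & abs(x - med) >= A * z_split

  # Extension: values more extreme than a flagged value on the same side.
  special <- flagged
  if (any(flagged & x > med)) {
    special <- special | x >= min(x[flagged & x > med])
  }
  if (any(flagged & x < med)) {
    special <- special | x <= max(x[flagged & x < med])
  }

  out <- data.frame(y = x, i = i, middle = middle, a = a, z = z,
                    special = special)
  attr(out, "z_split") <- z_split
  out
}

res <- funop_sketch(c(1, 2, 3, 11))
res[res$special, ]
```

With the vector c(1, 2, 3, 11), only the value 11 has a slope well above the median candidate slope, so under these assumptions it is the only row flagged.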
funop(x, A = 0, B = 1.5)
x: Numeric vector to inspect for outliers (does not need to be ordered)
A: Number of standard deviations beyond the median of x
B: Multiples beyond the median slope of candidate values
A data frame containing one row for every member of x (in the same order as x) and the following columns:
y: Original values of vector x
i: Ordinal position of value y in the sorted vector x
middle: Boolean indicating whether ordinal position i is in the middle third of the vector
a: Result of a_qnorm(i, length(x))
z: Slope of y relative to median(y)
special: Boolean indicating whether y is an outlier
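To make the a column concrete, here is a minimal sketch that assumes a_qnorm() follows Tukey's approximation for the typical normal value of the i-th of n ordered observations. The helper a_qnorm_sketch is a hypothetical stand-in, not the package function.

```r
# Hypothetical stand-in for a_qnorm(); assumes Tukey's approximation.
a_qnorm_sketch <- function(i, n) {
  qnorm((3 * i - 1) / (3 * n + 1))
}

n <- 9
a <- a_qnorm_sketch(seq_len(n), n)

# Scores are symmetric around the central position, where the score is zero.
round(a, 2)
```

Under this assumption the scores near the middle of the ordering are close to zero, so slopes computed there divide by very small numbers. That is one plausible reason the middle third is excluded when the median slope is taken.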
Tukey, John W. "The Future of Data Analysis." The Annals of Mathematical Statistics, 33(1), 1962, pp. 1-67. JSTOR, https://www.jstor.org/stable/2237638.
# NOT RUN {
funop(c(1, 2, 3, 11))
funop(table_1)
attr(funop(table_1), 'z_split')
# }