calout.detect(x, alpha = 0.05,
    method = c("GESD", "boxplot", "medmad", "shorth", "hybrid"),
    k = ((length(x)%%2) * floor(length(x)/2) +
        (1 - (length(x)%%2)) * (length(x)/2 - 1)),
    scaling, ftype, location, scale,
    gen.region = function(x, location, scale, scaling, alpha) {
        g <- scaling(length(x), alpha)
        location(x) + c(-1, 1) * g * scale(x)
    })
An important characteristic of the GESD procedure is that its critical values for outlier labeling are calibrated to preserve the overall Type I error rate across all k tests that the procedure performs, whether or not any outliers are present in the data.
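As a rough illustration of this calibration, the following base R sketch follows the textbook generalized ESD recipe (Rosner, 1983): at step i the most extreme point is removed and its studentized deviation is compared with a critical value that depends on n, i, and alpha. The function name `gesd_sketch` is hypothetical, and the code is not taken from the package, whose implementation may differ in detail.

```r
# Hypothetical helper, not part of the package: textbook generalized ESD.
gesd_sketch <- function(x, k, alpha = 0.05) {
  n <- length(x)
  stopifnot(k >= 1, k <= n - 2)
  y <- x
  R <- lambda <- removed <- numeric(k)
  for (i in seq_len(k)) {
    dev <- abs(y - mean(y))
    j <- which.max(dev)              # most extreme remaining point
    R[i] <- dev[j] / sd(y)           # studentized deviation at step i
    removed[i] <- y[j]
    y <- y[-j]
    # Critical value for step i, based on a t quantile
    p <- 1 - alpha / (2 * (n - i + 1))
    tq <- qt(p, df = n - i - 1)
    lambda[i] <- (n - i) * tq / sqrt((n - i - 1 + tq^2) * (n - i + 1))
  }
  # Number of outliers: the largest i whose statistic exceeds its critical value
  hits <- which(R > lambda)
  n_out <- if (length(hits) > 0) max(hits) else 0
  list(R = R, lambda = lambda, outliers = removed[seq_len(n_out)])
}

# One gross outlier appended to an otherwise regular sequence
res <- gesd_sketch(c(1:20, 1000), k = 3)
res$outliers
```

Taking the largest exceeding step, rather than stopping at the first non-significant one, is what lets the procedure see past a masked outlier.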
If method=="boxplot", the default value scaling=box.scale will confine the probability of erroneous detection of one or more outliers in a pure Gaussian sample to alpha. The use of scaling=function(n,alpha) 1.5 gives the standard boxplot outlier labeling rule.
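To see why a calibrated scaling matters, here is a minimal base R sketch of the uncalibrated 1.5 rule. It assumes quartiles from `stats::quantile` with its default type, which need not match any `ftype` choice in the package; `boxplot_rule` is a hypothetical helper name. The small simulation estimates how often the fixed multiplier flags at least one point in a pure Gaussian sample of the same size.

```r
# Hypothetical helper: flag points more than g * IQR beyond the quartiles.
# Assumption: default quantile() quartiles, not the package's ftype options.
boxplot_rule <- function(x, g = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  fences <- q + c(-1, 1) * g * diff(q)
  x[x < fences[1] | x > fences[2]]
}

lead <- c(83, 70, 62, 55, 56, 57, 57, 58, 59, 50, 51, 52, 52, 52, 54, 54,
          45, 46, 48, 48, 49, 40, 40, 41, 42, 42, 44, 44, 35, 37, 38, 38,
          34, 13, 14)
boxplot_rule(lead)   # flags 83, 13 and 14

# Share of Gaussian samples of this size with at least one flagged point
set.seed(1)
false_rate <- mean(replicate(2000,
  length(boxplot_rule(rnorm(length(lead)))) > 0))
false_rate           # compare with a nominal alpha of 0.05
```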
If method=="medmad", the use of scaling=hamp.scale.4 will confine the outlier mislabeling rate to alpha, whereas the use of scaling=function(n,alpha) 5.2 gives Hampel's rule (Davies and Gather, 1993, p. 790).
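The region behind Hampel's rule can be sketched in base R by plugging a median, a MAD, and the constant 5.2 into the default gen.region shown in the usage. The sketch assumes the raw MAD (`constant = 1`), which may not be the scale estimate the package uses internally; `raw_mad` and `hampel_scaling` are hypothetical names.

```r
# Default region generator, copied from the usage above
gen.region <- function(x, location, scale, scaling, alpha) {
  g <- scaling(length(x), alpha)
  location(x) + c(-1, 1) * g * scale(x)
}

# Assumption: unscaled MAD; the package's scale estimate may differ
raw_mad <- function(x) mad(x, constant = 1)
# Fixed multiplier that ignores n and alpha (Hampel's rule)
hampel_scaling <- function(n, alpha) 5.2

lead <- c(83, 70, 62, 55, 56, 57, 57, 58, 59, 50, 51, 52, 52, 52, 54, 54,
          45, 46, 48, 48, 49, 40, 40, 41, 42, 42, 44, 44, 35, 37, 38, 38,
          34, 13, 14)
region <- gen.region(lead, location = median, scale = raw_mad,
                     scaling = hampel_scaling, alpha = 0.05)
region
flagged <- lead[lead < region[1] | lead > region[2]]
flagged
```

For these data the median is 48 and the raw MAD is 7, so the region is 48 ± 36.4 and, under the stated assumption, no observation falls outside it.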
If method=="shorth", the default value scaling=shorth.scale will confine the outlier mislabeling rate to alpha.
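For readers unfamiliar with the term, the shorth is built on the shortest interval containing about half of the sample. The sketch below computes that half in base R using the common convention of floor(n/2) + 1 points. Both the convention and the helper name `shortest_half` are assumptions; how the package turns the result into location and scale estimates is not shown here.

```r
# Hypothetical helper: shortest interval covering h = floor(n/2) + 1 points
shortest_half <- function(x) {
  xs <- sort(x)
  n <- length(xs)
  h <- floor(n / 2) + 1
  starts <- seq_len(n - h + 1)
  widths <- xs[starts + h - 1] - xs[starts]   # width of each candidate half
  i <- which.min(widths)                      # first narrowest window
  half <- xs[i:(i + h - 1)]
  list(values = half, mean = mean(half), length = widths[i])
}

lead <- c(83, 70, 62, 55, 56, 57, 57, 58, 59, 50, 51, 52, 52, 52, 54, 54,
          45, 46, 48, 48, 49, 40, 40, 41, 42, 42, 44, 44, 35, 37, 38, 38,
          34, 13, 14)
sh <- shortest_half(lead)
sh$mean     # a resistant location estimate
sh$length   # a resistant measure of spread
```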
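# Example data with a few extreme values at both ends
lead <- c(83, 70, 62, 55, 56, 57, 57, 58, 59, 50, 51, 52, 52, 52, 54, 54, 45, 46, 48,
          48, 49, 40, 40, 41, 42, 42, 44, 44, 35, 37, 38, 38, 34, 13, 14)
# Boxplot method with the default calibrated scaling
calout.detect(lead, alpha = 0.05, method = "boxplot", ftype = "ideal")
# GESD with k = 5 tests
calout.detect(lead, alpha = 0.05, method = "GESD", k = 5)
# Median/MAD method with an explicit scaling function
calout.detect(lead, alpha = 0.05, method = "medmad", scaling = hamp.scale.3)
# Shorth method with its default scaling
calout.detect(lead, alpha = 0.05, method = "shorth")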