Detects outliers in one-dimensional data, based on the assumption
that the bulk of (the right side of) the observed data distribution can
be adequately described by a model distribution.
A value $y_i$ is an outlier if it lies above the limit where fewer than
rho observations are expected. rho must be >= 0.
pval
c(pmin,pmax) quantile limits indicating which data should be used
to fit the model distribution. Must obey 0 < pmin < pmax < 1.
method
Model distribution used to estimate the limit. Choose from
"lognormal" (default), "exponential", "pareto", "weibull" or "normal".
Value
iOut
Index vector indicating where y > limit.
nOut
Number of outliers. The largest nOut values of y are outliers.
limit
Outlier limit. Elements of y larger than or equal to limit are considered outliers.
Npop
Length of y.
method
The method used.
rho
The rho-value.
pmin
pval[1]
pmax
pval[2]
Nfit
Number of values used in the fit.
R2
R-squared value for the fit.
lambda
(exponential distribution) Estimated location (and spread) parameter for $f(y)=\lambda\exp(-\lambda y)$.
mu
(lognormal distribution) Estimated $E(\ln(y))$ for the lognormal distribution.
sigma
(lognormal distribution) Estimated $Var(\ln(y))$ for the lognormal distribution.
ym
(pareto distribution) Estimated location parameter (mode) for the pareto distribution.
alpha
(pareto distribution) Estimated spread parameter for the pareto distribution.
k
(weibull distribution) Estimated shape parameter $k$ for the weibull distribution.
lambda
(weibull distribution) Estimated scale parameter $\lambda$ for the weibull distribution.
mu
(normal distribution) Estimated $E(y)$ for the normal distribution.
sigma
(normal distribution) Estimated $Var(y)$ for the normal distribution.
Details
The function sorts the values of y and uses (log)linear regression to fit
the values between the pmin and pmax quantile to the cdf
of a model distribution. Given a model cdf $F$, the outlier limit $l$
is the value above which fewer than $\rho$ values are expected,
conditional on the total number $N$
of observations in $y$: $l=F^{-1}(1-\rho/N \mid \hat{\theta})$. Here,
$\hat{\theta}$ denotes the estimated parameters of the cdf.
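
The procedure described above can be sketched in base R for the lognormal case. This is a minimal illustration under stated assumptions, not the package's implementation: the function name outlier_limit_lognormal, the plotting positions i/(N+1) used as empirical cdf values, and the returned names meanlog and sdlog are all made up for this example. For a lognormal cdf, $\ln(y) = \mu + s\,\Phi^{-1}(F(y))$ with $s$ the standard deviation of $\ln(y)$, so a linear regression of the sorted $\ln(y)$ on normal quantiles yields both parameters.

```r
# Hypothetical helper: regression-based outlier limit, lognormal model only.
outlier_limit_lognormal <- function(y, rho = 1, pval = c(0.1, 0.9)) {
  stopifnot(all(y > 0), rho >= 0,
            pval[1] > 0, pval[1] < pval[2], pval[2] < 1)

  ys <- sort(y)
  N  <- length(ys)
  p  <- seq_len(N) / (N + 1)           # assumed plotting positions
  use <- p >= pval[1] & p <= pval[2]   # values between the pmin and pmax quantiles

  # Regress log-values on standard normal quantiles:
  # intercept estimates E(ln y), slope estimates sd(ln y).
  ly  <- log(ys[use])
  z   <- qnorm(p[use])
  fit <- lm(ly ~ z)
  meanlog <- unname(coef(fit)[1])
  sdlog   <- unname(coef(fit)[2])

  # l = F^{-1}(1 - rho/N | estimated parameters)
  limit <- qlnorm(1 - rho / N, meanlog = meanlog, sdlog = sdlog)

  list(limit = limit,
       iOut = which(y > limit),
       meanlog = meanlog, sdlog = sdlog,
       Nfit = sum(use),
       R2 = summary(fit)$r.squared)
}

# Usage: 100 lognormal draws plus one gross error in position 101.
set.seed(1)
y <- c(rlnorm(100), 1e4)
res <- outlier_limit_lognormal(y)
res$limit
res$iOut
```

Because only the values between the pmin and pmax quantiles enter the regression, the gross error in the upper tail does not influence the fitted parameters. Note that sdlog in this sketch is a standard deviation, which is what qlnorm expects.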
References
An outlier detection method for economic data, M.P.J. van der
Loo, Submitted to The Journal of Official Statistics (November 2009)
The file /R-/library/extremevalues/extremevalues.pdf
contains a worked example. It can also be downloaded from my website.