The function `is.outlier` checks which observations of a time series are outliers, using the interquartile range (IQR) rule, the sigma rule, or the z-score rule.
is.outlier(x, method = c("iqr", "sigma", "zscore"), param = NULL)
A vector indicating whether the values in x are outliers (TRUE) or not (FALSE).
`x`: a time series or any other R data type.
`method`: method based on which the outliers are identified. Available options are `iqr`, `sigma`, and `zscore`.
`param`: parameter value for setting specific boundary criteria. Default is NULL.
Ka Yui Karl Wu
With method = "iqr", the interquartile range rule for outlier identification is applied. An observation \(x_i\) is identified as an outlier if either of the following conditions holds:
$$x_i < q_1 - m \cdot (q_3 - q_1)$$
$$x_i > q_3 + m \cdot (q_3 - q_1)$$
where \(q_1\) and \(q_3\) are the 1st and 3rd quartiles of the time series x, respectively, and m is the value specified by param. If param is omitted, m is set to 1.5.
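The two IQR conditions can be sketched in base R as follows. This is a minimal illustration rather than the package's implementation: the helper name `iqr_outlier` is hypothetical, and it assumes quartiles are computed with the default settings of `quantile()`, which may differ from what `is.outlier` uses internally.

```r
# Hypothetical helper illustrating the IQR rule; not the package's own code.
iqr_outlier <- function(x, m = 1.5) {
  # Assumption: quartiles from quantile() with its default type
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  spread <- q[2] - q[1]                      # interquartile range q3 - q1
  x < q[1] - m * spread | x > q[2] + m * spread
}

x <- c(10, 12, 11, 13, 12, 11, 40)
which(iqr_outlier(x))                        # positions flagged as outliers
```

For this toy vector the quartiles are 11 and 12.5, so the bounds are 8.75 and 14.75 and only the last value is flagged.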
By using method = "sigma", the following criteria for outlier identification, known as the 3-sigma rule, are applied:
$$x_i < \mu(x) - m \cdot \sigma(x)$$
$$x_i > \mu(x) + m \cdot \sigma(x)$$
where \(\mu(x)\) and \(\sigma(x)\) are the mean and standard deviation of the time series x, respectively, and m is the value specified by param. If param is omitted, m is set to 3.
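A minimal base R sketch of the sigma rule is shown here. The helper name `sigma_outlier` is hypothetical, and the sketch assumes the sample standard deviation returned by `sd()` (denominator n - 1); the package may compute \(\sigma(x)\) differently.

```r
# Hypothetical helper illustrating the sigma rule; not the package's own code.
sigma_outlier <- function(x, m = 3) {
  mu <- mean(x, na.rm = TRUE)
  s  <- sd(x, na.rm = TRUE)                  # assumption: sample standard deviation
  x < mu - m * s | x > mu + m * s
}

x <- c(rep(10, 20), 100)
which(sigma_outlier(x))                      # positions flagged as outliers
```

With a sample standard deviation, no observation in a series of n points can lie more than \((n-1)/\sqrt{n}\) standard deviations from the mean, so with m = 3 this sketch can only flag outliers in series of at least 11 observations.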
The z-score rule, specified by method = "zscore", compares the standardised observation values to a specific threshold:
$$\left|\dfrac{(x_i - \mu(x))}{\sigma(x)}\right| > m$$
where \(\mu(x)\) and \(\sigma(x)\) are the mean and standard deviation of the time series x, respectively, and m is the value specified by param. If param is omitted, m is set to 2. Note that 2 is the threshold for mild outliers; to check for extreme outliers only, set param to 3.
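The z-score rule can be sketched in base R in the same way. The helper name `zscore_outlier` is hypothetical, and the sample standard deviation from `sd()` is again an assumption. For the same m and a non-zero standard deviation, this is algebraically the same test as the sigma rule; only the default threshold differs.

```r
# Hypothetical helper illustrating the z-score rule; not the package's own code.
zscore_outlier <- function(x, m = 2) {
  # Assumption: standardise with the sample standard deviation
  z <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
  abs(z) > m
}

x <- c(10, 12, 11, 13, 12, 11, 40)
which(zscore_outlier(x))                     # mild outliers, threshold 2
which(zscore_outlier(x, m = 3))              # extreme outliers, threshold 3
```

In this toy vector the last value has a z-score of roughly 2.26, so it is flagged at the default threshold of 2 but not at the stricter threshold of 3.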
is.outlier(airport$Travellers, method = "zscore")