
tsforecast (version 1.3.0)

is.outlier: Outlier Identification

Description

The function `is.outlier` checks which observations of a time series are outliers, using the interquartile range (IQR) rule, the sigma rule, or the z-score rule.

Usage

is.outlier(x, method = c("iqr", "sigma", "zscore"), param = NULL)

Value

A logical vector indicating, for each value in x, whether it is an outlier (TRUE) or not (FALSE).

Arguments

x

a time series or any other R data type.

method

method based on which the outliers are identified. Available options are `iqr`, `sigma`, and `zscore`.

param

parameter value for setting specific boundary criteria. Default is NULL.

Author

Ka Yui Karl Wu

Details

With method = "iqr", the interquartile range rule for outlier identification is applied. An observation \(x_i\) is identified as an outlier if either of the following conditions is fulfilled:

$$x_i < q_1 - m \cdot (q_3 - q_1)$$

$$x_i > q_3 + m \cdot (q_3 - q_1)$$

where \(q_1\) and \(q_3\) are the first and third quartiles of the time series x, respectively, and m is the value specified by param. If param is omitted, m defaults to 1.5.
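The IQR rule above can be sketched in a few lines of base R. This is only an illustration, not the package's implementation: the helper name `iqr_outlier` is hypothetical, and it assumes the quartiles are computed with the default algorithm of `stats::quantile()`, which may differ from what `is.outlier` does internally.

```r
# Hypothetical helper illustrating the IQR rule; not part of tsforecast.
iqr_outlier <- function(x, m = 1.5) {
  # Assumption: quartiles come from stats::quantile() with its default type.
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  spread <- q[2] - q[1]  # interquartile range q3 - q1
  # Flag values below q1 - m * IQR or above q3 + m * IQR.
  x < q[1] - m * spread | x > q[2] + m * spread
}

iqr_example <- c(10, 12, 11, 13, 12, 11, 40)
iqr_outlier(iqr_example)
# Only the last value (40) is flagged: the fences are 8.75 and 14.75.
```

A larger m widens both fences, so fewer observations are flagged.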

By using method = "sigma", the following criteria for outlier identification, known as the 3-sigma rule, are applied:

$$x_i < \mu(x) - m \cdot \sigma(x)$$

$$x_i > \mu(x) + m \cdot \sigma(x)$$

where \(\mu(x)\) and \(\sigma(x)\) are the mean and standard deviation of the time series x, respectively, and m is the value specified by param. If param is omitted, m defaults to 3.
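As a rough sketch of the sigma rule, the two conditions translate directly into base R. The helper name `sigma_outlier` is hypothetical, and the use of the sample standard deviation (`stats::sd()`, with an n - 1 denominator) is an assumption; the documentation does not say which estimator `is.outlier` uses.

```r
# Hypothetical helper illustrating the sigma rule; not part of tsforecast.
sigma_outlier <- function(x, m = 3) {
  mu <- mean(x, na.rm = TRUE)
  s <- stats::sd(x, na.rm = TRUE)  # assumption: sample standard deviation
  # Flag values more than m standard deviations away from the mean.
  x < mu - m * s | x > mu + m * s
}

sigma_example <- c(rep(10, 19), 100)
which(sigma_outlier(sigma_example))
# Only position 20 (the value 100) lies above mean + 3 * sd.
```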

The z-score rule, specified by method = "zscore", compares the standardised observation values to a specific threshold:

$$\left|\dfrac{(x_i - \mu(x))}{\sigma(x)}\right| > m$$

where \(\mu(x)\) and \(\sigma(x)\) are the mean and standard deviation of the time series x, respectively, and m is the value specified by param. If param is omitted, m defaults to 2, which is the threshold for mild outliers. To check for extreme outliers instead, set param to 3.
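The z-score rule can be illustrated in the same way, contrasting the mild threshold (m = 2) with the extreme one (m = 3). The helper name `zscore_outlier` is hypothetical, and standardising with the sample standard deviation is an assumption rather than documented behaviour of `is.outlier`.

```r
# Hypothetical helper illustrating the z-score rule; not part of tsforecast.
zscore_outlier <- function(x, m = 2) {
  # Assumption: standardise with the sample mean and sample standard deviation.
  z <- (x - mean(x, na.rm = TRUE)) / stats::sd(x, na.rm = TRUE)
  abs(z) > m
}

z_example <- c(10, 12, 11, 13, 12, 11, 40)
which(zscore_outlier(z_example))         # mild threshold: position 7 is flagged
which(zscore_outlier(z_example, m = 3))  # extreme threshold: nothing is flagged
```

In this series the value 40 has a standardised value of roughly 2.26, so it counts as a mild outlier but not as an extreme one.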

Examples

is.outlier(airport$Travellers, method = "zscore")
