This function computes the \(t\)-statistics for the significance of each type of outlier at every time point and selects those that are significant given a critical value.
locate.outliers(resid, pars, cval = 3.5, types = c("AO", "LS", "TC"),
delta = 0.7, n.start = 50)
resid: a time series. Residuals from a time series model fitted to the data.
pars: a list containing the parameters of the model fitted to the data. See details below.
cval: a numeric. The critical value to determine the significance of each type of outlier.
types: a character vector indicating the types of outliers to be considered.
delta: a numeric. Parameter of the temporary change type of outlier.
n.start: a numeric. The number of warm-up observations added to the input passed to the Kalman filter. Only for "stsm".
A data frame with one row per potential outlier. The columns give, in order, the type of outlier, the observation, the coefficient and the \(t\)-statistic.
Five types of outliers can be considered. By default, "AO" additive outliers, "LS" level shifts, and "TC" temporary changes are selected; "IO" innovative outliers and "SLS" seasonal level shifts can also be selected.
The approach described in Chen & Liu (1993) is followed to locate outliers. The original framework is based on ARIMA time series models. The extension to structural time series models is currently experimental.
Let us define an ARIMA model for the series \(y_t^*\) subject to \(m\) outliers with dynamics \(L_j(B)\) and weights \(\omega_j\):
$$y_t^* = \sum_{j=1}^m \omega_j L_j(B) I_t(t_j) + \frac{\theta(B)}{\phi(B) \alpha(B)} a_t \,,$$
where \(I_t(t_j)\) is an indicator variable containing the value \(1\) at observation \(t_j\) where the \(j\)-th outlier arises; \(\phi(B)\) is an autoregressive polynomial with all roots outside the unit circle; \(\theta(B)\) is a moving average polynomial with all roots outside the unit circle; and \(\alpha(B)\) is an autoregressive polynomial with all roots on the unit circle.
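As a rough illustration of the dynamics \(L_j(B)\) for the three default types, the sketch below builds the response to a unit impulse using base R only. It assumes the usual Chen & Liu (1993) definitions (AO: \(L(B) = 1\); LS: \(L(B) = 1/(1-B)\); TC: \(L(B) = 1/(1-\delta B)\)) with \(\delta = 0.7\) as in the default value of delta. The outlier date t0 and all variable names are hypothetical and are not part of the package.

```r
n <- 8         # length of the illustrative series (assumption)
t0 <- 3        # hypothetical date of the outlier
delta <- 0.7   # same value as the default of the 'delta' argument

# indicator variable I_t(t0): 1 at t0, 0 elsewhere
pulse <- numeric(n)
pulse[t0] <- 1

ao_eff <- pulse                      # AO: L(B) = 1, a single spike
ls_eff <- cumsum(pulse)              # LS: L(B) = 1/(1 - B), a permanent step
# TC: L(B) = 1/(1 - delta B), a spike that decays geometrically
tc_eff <- as.numeric(stats::filter(pulse, filter = delta, method = "recursive"))

effects <- round(cbind(AO = ao_eff, LS = ls_eff, TC = tc_eff), 3)
print(effects)
```

A value of delta closer to 1 makes the temporary change decay more slowly, so that it resembles a level shift; a value closer to 0 makes it resemble an additive outlier.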
The presence of outliers is tested by means of \(t\)-statistics applied on the following regression equation:
$$\pi(B) y_t^* \equiv \hat{e}_t = \sum_{j=1}^m \omega_j \pi(B) L_j(B) I_t(t_j) + a_t \,,$$
where \(\pi(B) = \sum_{i=0}^\infty \pi_i B^i\).
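The weights \(\pi_i\) can be obtained by expanding the ratio of the autoregressive to the moving average polynomial. A minimal sketch for a stationary ARMA(1,1) without unit roots (that is, assuming \(\alpha(B) = 1\)) follows. It relies on the sign convention of R's stats package, \(\phi(B) = 1 - \phi_1 B\) and \(\theta(B) = 1 + \theta_1 B\), and reuses ARMAtoMA with the roles of the two polynomials swapped. The coefficient values are made up for illustration.

```r
phi <- 0.5     # illustrative AR(1) coefficient (assumption)
theta <- 0.4   # illustrative MA(1) coefficient (assumption)

# pi(B) = phi(B) / theta(B) = (1 - phi B) / (1 + theta B).
# ARMAtoMA(ar, ma) expands (1 + ma B) / (1 - ar B), so passing
# ar = -theta and ma = -phi yields pi_1, pi_2, ... (pi_0 = 1 is implicit).
pi_tail <- ARMAtoMA(ar = -theta, ma = -phi, lag.max = 5)
pi_weights <- c(1, pi_tail)
print(round(pi_weights, 4))
```

For an additive outlier, the regressor \(\pi(B) I_t(t_j)\) is simply this sequence of weights starting at observation \(t_j\).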
The regressors of the above equation are created by the function outliers.regressors.arima and the remaining functions described here.
The function locate.outliers computes all the \(t\)-statistics for each type of outlier and for every time point. See outliers.tstatistics.
Then, the cases where the corresponding \(t\)-statistic is (in absolute value) below the threshold cval are removed. Thus, a potential set of outliers is obtained.
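The thresholding step can be sketched in the simplest case, where the regressor \(\pi(B) L_j(B) I_t(t_j)\) reduces to the indicator itself and the statistic is the residual divided by a scale estimate. The example below uses base R only and is not the package's code: the simulated residuals, the planted shock at observation 50 and the use of mad() as a robust scale estimate are all assumptions made for illustration.

```r
set.seed(123)
e <- rnorm(100)   # simulated residuals (illustrative data)
e[50] <- 10       # planted shock at a hypothetical date

sigma_hat <- mad(e)       # robust scale estimate (assumption)
tstat <- e / sigma_hat    # one statistic per time point

cval <- 3.5               # same value as the default of 'cval'
flagged <- data.frame(ind = which(abs(tstat) > cval))
flagged$tstat <- tstat[flagged$ind]
print(flagged)
```

For the other types of outliers the regressor is no longer a single spike, so the statistic is the \(t\)-ratio of a regression of the residuals on that regressor rather than a standardized residual.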
Some polishing rules are applied by locate.outliers:
If level shifts are found at consecutive time points, only the point with the higher \(t\)-statistic in absolute value is kept.
If more than one type of outlier exceeds the threshold cval at a given time point, the type of outlier with the higher \(t\)-statistic in absolute value is kept and the others are removed.
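The two polishing rules can be sketched on a toy table of candidates as follows. This is an illustration in base R, not the package's implementation: the data frame, its column names and the helper keep_max are hypothetical, and the order in which the rules are applied here is an assumption.

```r
# toy candidates that already exceed the threshold (illustrative values)
cand <- data.frame(
  type  = c("AO", "LS", "LS", "LS", "TC"),
  ind   = c(12, 12, 20, 21, 40),
  tstat = c(4.1, -3.9, 3.8, -4.6, 5.2),
  stringsAsFactors = FALSE
)

# hypothetical helper: keep the row with the largest |t-statistic|
keep_max <- function(d) d[which.max(abs(d$tstat)), , drop = FALSE]

# several types at the same time point: keep one type per observation
cand <- do.call(rbind, lapply(split(cand, cand$ind), keep_max))

# level shifts at consecutive time points: keep one point per run
ls_rows <- cand[cand$type == "LS", ]
ls_rows <- ls_rows[order(ls_rows$ind), ]
run_id <- cumsum(c(1, diff(ls_rows$ind) != 1))   # label runs of consecutive dates
ls_kept <- do.call(rbind, lapply(split(ls_rows, run_id), keep_max))

polished <- rbind(cand[cand$type != "LS", ], ls_kept)
polished <- polished[order(polished$ind), ]
rownames(polished) <- NULL
print(polished)
```

In this toy table the additive outlier wins over the level shift at observation 12, and of the two consecutive level shifts at observations 20 and 21 only the latter is kept.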
The argument pars is a list containing the parameters of the model. In the framework of ARIMA models, the coefficients of the ARIMA must be defined in pars as the product of the autoregressive non-seasonal and seasonal polynomials (if any) and the differencing filter (if any). The function coefs2poly can be used to define the argument pars.
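To make the product of polynomials concrete, the sketch below multiplies an autoregressive polynomial by a differencing filter in base R. The helper polymul and the coefficient values are hypothetical, and the way coefs2poly stores the result inside pars (element names, sign convention) may differ from the plain coefficient vectors used here.

```r
# hypothetical helper: product of two polynomials in B, each given as a
# vector of coefficients with the constant term first
polymul <- function(a, b) {
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a)) {
    idx <- i + seq_along(b) - 1
    out[idx] <- out[idx] + a[i] * b
  }
  out
}

ar_poly <- c(1, -0.5)    # (1 - 0.5 B), illustrative AR(1) polynomial
diff_poly <- c(1, -1)    # (1 - B), first-difference filter

# (1 - 0.5 B)(1 - B) = 1 - 1.5 B + 0.5 B^2
full_ar <- polymul(ar_poly, diff_poly)
print(full_ar)
```

A seasonal autoregressive polynomial, if present, would enter the product in the same way.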
For structural time series models that are fitted by means of the package stsm, the function char2numeric available in the stsm package is the most convenient way to define the argument pars.
Chen, C. and Liu, Lon-Mu (1993). ‘Joint Estimation of Model Parameters and Outlier Effects in Time Series’. Journal of the American Statistical Association, 88(421), pp. 284-297.
Gómez, V. and Maravall, A. (1996). Programs TRAMO and SEATS. Instructions for the user. Banco de España, Servicio de Estudios. Working paper number 9628. http://www.bde.es/f/webbde/SES/Secciones/Publicaciones/PublicacionesSeriadas/DocumentosTrabajo/96/Fich/dt9628e.pdf
Gómez, V. and Taguas, D. (1995). Detección y Corrección Automática de Outliers con TRAMO: Una Aplicación al IPC de Bienes Industriales no Energéticos. Ministerio de Economía y Hacienda. Document number D-95006. http://www.sepg.pap.minhap.gob.es/sitios/sepg/es-ES/Presupuestos/Documentacion/Documents/DOCUMENTOS%20DE%20TRABAJO/D95006.pdf
López-de-Lacalle, J. (2014). ‘Structural Time Series Models’. R package version 1.2. https://CRAN.R-project.org/package=stsm
Kaiser, R., and Maravall, A. (1999). Seasonal Outliers in Time Series. Banco de España, Servicio de Estudios. Working paper number 9915.
locate.outliers.oloop, locate.outliers.iloop, outliers.tstatistics, tso.
# NOT RUN {
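library("tsoutliers")  # provides hicp, coefs2poly and locate.outliers
data("hicp")
y <- log(hicp[["011600"]])
# fit a seasonal ARIMA model to the log series
fit <- arima(y, order = c(1, 1, 0), seasonal = list(order = c(2, 0, 2)))
resid <- residuals(fit)
# ARIMA coefficients in the polynomial form expected by 'pars'
pars <- coefs2poly(fit)
# potential outliers with |t-statistic| above the default cval = 3.5
outliers <- locate.outliers(resid, pars)
outliers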
# }