tso: Automatic Procedure for Detection of Outliers

Description

These functions are the interface to the automatic detection procedure provided in this package.

Usage

tso(y, xreg = NULL, cval = NULL, delta = 0.7, 
  types = c("AO", "LS", "TC"), 
  maxit = 1, maxit.iloop = 4, maxit.oloop = 4, cval.reduce = 0.14286, 
  discard.method = c("en-masse", "bottom-up"), discard.cval = NULL, 
  remove.method, remove.cval,
  tsmethod = c("auto.arima", "arima"), 
  args.tsmethod = NULL, logfile = NULL, check.rank = FALSE)
tso0(x, xreg = NULL, cval = 3.5, delta = 0.7, 
  types = c("AO", "LS", "TC"), maxit.iloop = 4, maxit.oloop = 4,
  discard.method = c("en-masse", "bottom-up"), discard.cval = NULL,   
  tsmethod = c("auto.arima", "arima"), args.tsmethod = NULL, 
  logfile = NULL, check.rank = FALSE)

Value

A list of class tsoutliers.

Arguments

y: a time series where outliers are to be detected.
x: a time series object.
xreg: an optional matrix of regressors with the same number of rows as y.
cval: a numeric. The critical value to determine the significance of each type of outlier.
delta: a numeric. Parameter of the temporary change type of outlier.
types: a character vector indicating the type of outlier to be considered by the detection procedure: innovational outliers ("IO"), additive outliers ("AO"), level shifts ("LS"), temporary changes ("TC") and seasonal level shifts ("SLS").
maxit: a numeric. The maximum number of iterations.
maxit.iloop: a numeric. The maximum number of iterations in the inner loop. See locate.outliers.
maxit.oloop: a numeric. The maximum number of iterations in the outer loop.
cval.reduce: a numeric. Factor by which cval is reduced if the procedure is run on the adjusted series, if maxit > 1.
discard.method: a character. The method used in the second stage of the procedure. See discard.outliers.
discard.cval: a numeric. The critical value to determine the significance of each type of outlier in the second stage of the procedure (discard outliers). By default it is set equal to cval. See details.
remove.method: deprecated, argument discard.method should be used.
remove.cval: deprecated, argument discard.cval should be used.
tsmethod: a character. The framework for time series modelling. It is the name of the function that fits the model and to which the arguments defined in args.tsmethod are passed.
args.tsmethod: an optional list containing arguments to be passed to the function invoking the method selected in tsmethod.

logfile: a character or NULL. It is the path to the file where tracking information is printed. Ignored if NULL.
check.rank: logical. If TRUE the regressors are checked for perfect collinearity. The variables related to coefficients that turn out to be NA due to possible perfect collinearity are discarded.

Details

Five types of outliers can be considered. By default, "AO" additive outliers, "LS" level shifts, and "TC" temporary changes are selected; "IO" innovational outliers and "SLS" seasonal level shifts can also be selected.

tso0 is mostly a wrapper function around the functions locate.outliers and discard.outliers.

tso iterates around tso0 first for the original series and then for the adjusted series. The process stops if no additional outliers are found in the current iteration or if maxit iterations are reached.

tso0 is an auxiliary function. It is the workhorse for tso, but it is not intended to be called directly by the user; use tso(maxit = 1, ...) instead. tso0 does not check the arguments, since they are assumed to have been checked already by tso; its default value for cval is not based on the sample size. For the time being, tso0 is exported in the NAMESPACE since it is convenient for debugging.

If no value is specified for argument cval, a default value based on the sample size is used. Let \(n\) be the number of observations. If \(n \le 50\) then cval is set equal to \(3.0\); if \(n \ge 450\) then cval is set equal to \(4.0\); otherwise cval is set equal to \(3 + 0.0025 \times (n - 50)\).
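The rule above can be sketched in a few lines of R. The helper name default_cval is hypothetical and used only for illustration; it mirrors the description in this section and is not necessarily how the package computes the value internally.

```r
# Hypothetical helper (not part of the package): default critical value
# as a function of the number of observations n, following the rule above.
default_cval <- function(n) {
  if (n <= 50) {
    3
  } else if (n >= 450) {
    4
  } else {
    3 + 0.0025 * (n - 50)
  }
}

# Evaluate the rule at a short, a medium and a long sample size.
sapply(c(30, 250, 500), default_cval)
```

For example, with n = 250 the rule gives 3 + 0.0025 * 200 = 3.5.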

If args.tsmethod is NULL, the following lists are used by default, respectively for each method selected in tsmethod: auto.arima: list(allowdrift = FALSE, ic = "bic"); arima: list(order = c(0, 1, 1), seasonal = list(order = c(0, 1, 1))).
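As a minimal sketch, the defaults listed above can be written as a lookup on the method name. The function default_args_tsmethod is a hypothetical name introduced here for illustration; the package may build these lists differently.

```r
# Hypothetical helper (not part of the package): returns the default
# argument list described above for the chosen tsmethod.
default_args_tsmethod <- function(tsmethod = c("auto.arima", "arima")) {
  tsmethod <- match.arg(tsmethod)
  switch(tsmethod,
    "auto.arima" = list(allowdrift = FALSE, ic = "bic"),
    "arima" = list(order = c(0, 1, 1), seasonal = list(order = c(0, 1, 1))))
}

# Inspect the defaults for each method.
str(default_args_tsmethod("auto.arima"))
str(default_args_tsmethod("arima"))
```

A list of the same shape can be passed explicitly through args.tsmethod to override these defaults.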

xreg must be a matrix with time series attributes, tsp, that must be the same as those of the input series (tsp(y) in tso, tsp(x) in tso0). Column names are also compulsory. If there is only one regressor it must still have non-null dimension, i.e. it must be a one-column matrix.
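The following sketch shows one way to build a regressor that meets these requirements using base R only. The series y and the regressor name "step2021" are made up for illustration.

```r
# Artificial monthly series standing in for the input series (illustrative only).
y <- ts(cumsum(rep(c(1, -1, 2), 8)), start = c(2020, 1), frequency = 12)

# A single hypothetical regressor, kept as a one-column matrix with a column name.
x <- matrix(as.numeric(seq_along(y) >= 13), ncol = 1,
  dimnames = list(NULL, "step2021"))

# ts() applied to a matrix keeps its dimensions and attaches the tsp attribute.
xreg <- ts(x, start = start(y), frequency = frequency(y))

# Inspect the properties required of xreg.
dim(xreg)
colnames(xreg)
all.equal(tsp(xreg), tsp(y))
```

Building xreg from start(y) and frequency(y) keeps its tsp attribute aligned with that of the series by construction.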

The external regressors (if any) should be defined in the argument xreg. However, they may also be defined as an element of args.tsmethod, since this list is passed to the function that fits the model. The function tso deals with this possibility and returns a warning if "xreg" is defined twice with different values. No checks are done in tso0.

If maxit = 1 the procedure is run only once on the original series. If maxit > 1 the procedure is run iteratively, first for the original series and then for the adjusted series. The critical value used for the adjusted series may be reduced by the factor cval.reduce, equal to \(0.14286\) by default. The new critical value is defined as \(cval * (1 - cval.reduce)\).
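As a small worked example of the formula above, assume an initial critical value of 3.5 (chosen here only for illustration) and the default cval.reduce:

```r
# Illustrative starting critical value (an assumption) and the default factor.
cval <- 3.5
cval.reduce <- 0.14286

# Critical value applied to the adjusted series, as defined above.
cval.adjusted <- cval * (1 - cval.reduce)
cval.adjusted
```

With these inputs the reduced critical value is approximately 3, since 0.14286 is roughly 1/7.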

By default, the same critical value is used in the first stage of the procedure (location of outliers) and in the second stage (discard outliers). Under the framework of structural time series models I noticed that the default critical value based on the sample size is too high, since all the potential outliers located in the first stage were discarded in the second stage (even in simulated series with known location of outliers). In order to investigate this issue, the argument discard.cval has been added. In this way a different critical value can be used in the second stage. Alternatively, discard.cval could be omitted and a lower critical value, cval, chosen for use in both stages. However, using the argument discard.cval is more convenient since it avoids locating too many outliers in the first stage. discard.cval is not affected by cval.reduce.

References

Chen, C. and Liu, Lon-Mu (1993). ‘Joint Estimation of Model Parameters and Outlier Effects in Time Series’. Journal of the American Statistical Association, 88(421), pp. 284-297.

Gómez, V. and Maravall, A. (1996). Programs TRAMO and SEATS. Instructions for the user. Banco de España, Servicio de Estudios. Working paper number 9628. http://www.bde.es/f/webbde/SES/Secciones/Publicaciones/PublicacionesSeriadas/DocumentosTrabajo/96/Fich/dt9628e.pdf

Gómez, V. and Taguas, D. (1995). Detección y Corrección Automática de Outliers con TRAMO: Una Aplicación al IPC de Bienes Industriales no Energéticos. Ministerio de Economía y Hacienda. Document number D-95006. http://www.sepg.pap.minhap.gob.es/sitios/sepg/es-ES/Presupuestos/Documentacion/Documents/DOCUMENTOS%20DE%20TRABAJO/D95006.pdf

Examples

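if (FALSE) {
# Not run: requires the tsoutliers package, which provides the hicp data
library("tsoutliers")
data("hicp")
# Detect outliers in the log of the first series in hicp
tso(y = log(hicp[[1]]))
}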
