
smooth (version 1.5.1)

auto.ssarima: State-Space ARIMA

Description

The function selects the best State-Space ARIMA based on an information criterion, using a branch and bound mechanism. The resulting model may not be optimal in terms of the information criterion, but it is usually reasonable.

Usage

auto.ssarima(data, orders=list(ar=c(3,3),i=c(2,1),ma=c(3,3)), lags=c(1,frequency(data)),
  combine=FALSE, workFast=TRUE, initial=c("backcasting","optimal"),
  ic=c("AICc","AIC","BIC"), cfType=c("MSE","MAE","HAM","MLSTFE","MSTFE","MSEh"),
  h=10, holdout=FALSE, intervals=c("none","parametric","semiparametric","nonparametric"),
  level=0.95, intermittent=c("none","auto","fixed","croston","tsb"),
  bounds=c("admissible","none"), silent=c("none","all","graph","legend","output"),
  xreg=NULL, updateX=FALSE, ...)

Arguments

data
Data that needs to be forecasted.
orders
List of maximum orders to check, containing vector variables ar, i and ma. If a variable is not provided in the list, then it is assumed to be equal to zero. At least one variable should have the same length as lags.
lags
Defines lags for the corresponding orders (see examples). The length of lags must correspond to the length of either ar.orders, i.orders or ma.orders. There are no restrictions on the length of the lags vector.
combine
If TRUE, then the resulting ARIMA is combined using AIC weights.
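
The weighting scheme is not spelled out here. As a minimal sketch, assuming the standard Akaike weights (each model weighted by exp(-0.5 * (AIC - min(AIC))), normalised to sum to one), a combination could be computed as follows. The model names, AIC values and forecasts are hypothetical and only illustrate the idea; this is not the package's internal code.

```r
# Hypothetical AIC values of three candidate ARIMA models
ics <- c(ARIMA1 = 612.4, ARIMA2 = 610.1, ARIMA3 = 615.8)

# Akaike weights: relative likelihood of each model, normalised to sum to one
delta <- ics - min(ics)
w <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))

# Hypothetical 3-steps-ahead point forecasts, one row per model
fc <- rbind(ARIMA1 = c(101, 102, 103),
            ARIMA2 = c(100, 100, 100),
            ARIMA3 = c(99, 98, 97))

# Combined forecast: weighted sum over models for each horizon
# (w is recycled down the rows, so row i is multiplied by w[i])
combined <- colSums(w * fc)

print(round(w, 3))
print(combined)
```

The model with the lowest criterion value receives the largest weight, and models that are far behind contribute almost nothing to the combination.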
workFast
If TRUE, then some of the orders of ARIMA are skipped. This is not advised for models with lags greater than 12.
initial
Character value which defines how the model is initialised: it can be "optimal", meaning that the initial states are optimised, or "backcasting", meaning that the initial states are produced using a backcasting procedure.
ic
Information criterion to use in model selection.
cfType
Type of cost function used in optimisation. cfType can be: MSE (Mean Squared Error), MAE (Mean Absolute Error), HAM (Half Absolute Moment), MLSTFE (Mean Log Squared Trace Forecast Error), MSTFE (Mean Squared Trace Forecast Error) and MSEh (optimisation using only the h-steps-ahead error). If cfType!="MSE", then the likelihood and model selection are based on the equivalent MSE. Model selection in these cases is no longer optimal.

Analytical approximations are also available for the multistep functions: aMSEh, aMSTFE and aMLSTFE. These can be useful in cases of small samples.
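
To make the one-step cost functions concrete, here is a minimal sketch in base R. The formulas are assumptions based on the names given above (in particular, HAM is taken to be the mean of the square roots of absolute errors), and the helper names mse, mae and ham are introduced here for illustration only; they are not functions of the package.

```r
# Simulated one-step-ahead errors, standing in for the residuals of a model
set.seed(1)
e <- rnorm(100, mean = 0, sd = 3)

mse <- function(e) mean(e^2)            # Mean Squared Error
mae <- function(e) mean(abs(e))         # Mean Absolute Error
ham <- function(e) mean(sqrt(abs(e)))   # Half Absolute Moment (assumed form)

print(c(MSE = mse(e), MAE = mae(e), HAM = ham(e)))
```

MSE penalises large errors most heavily, while HAM is the least sensitive of the three to outliers.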

h
The forecasting horizon.
holdout
If TRUE, the holdout sample of size h will be taken from the data. If FALSE, no holdout is defined.
intervals
Type of intervals to construct. This can be:

  • none, aka n - do not produce prediction intervals.
  • parametric, p - use the state-space structure of ETS. For mixed models this is done using simulations, which may take longer than for pure additive and pure multiplicative models.
  • semiparametric, sp - intervals based on the covariance matrix of 1 to h steps ahead errors and the assumption of a normal / log-normal distribution (depending on error type).
  • nonparametric, np - intervals based on values from a quantile regression on the error matrix (see Taylor and Bunn, 1999). The model used in this process is e[j] = a j^b, where j = 1, ..., h.
  • The parameter also accepts TRUE and FALSE. The former means that parametric intervals are constructed, while the latter is equivalent to none.
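
The nonparametric option can be illustrated with a rough sketch in base R. It fits the curve e[j] = a j^b to a matrix of multistep errors by minimising the quantile (pinball) loss with optim. The error matrix is simulated, the pinball helper is a hypothetical name, and the optimiser settings are assumptions, so this shows the idea rather than the package's actual implementation.

```r
set.seed(42)
h <- 10
level <- 0.95

# Hypothetical error matrix: one row per forecast origin, one column per
# horizon j = 1..h; the spread of the errors grows with the horizon
errors <- sapply(1:h, function(j) rnorm(200, mean = 0, sd = sqrt(j)))

# Pinball (quantile) loss of the curve e[j] = a * j^b at probability tau
pinball <- function(par, tau) {
  curve <- par[1] * (1:h)^par[2]
  u <- sweep(errors, 2, curve)      # subtract curve[j] from column j
  mean(u * (tau - (u < 0)))
}

# One quantile regression per bound of the interval
upper <- optim(c(1, 0.5), pinball, tau = (1 + level) / 2)$par
lower <- optim(c(-1, 0.5), pinball, tau = (1 - level) / 2)$par

# Error bounds for horizons 1..h, to be added to the point forecasts
upperBound <- upper[1] * (1:h)^upper[2]
lowerBound <- lower[1] * (1:h)^lower[2]

print(round(cbind(lower = lowerBound, upper = upperBound), 2))
```

With level = 0.95 the two regressions target the 2.5% and 97.5% quantiles of the errors, so a higher level gives wider intervals.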

    level
    Confidence level, which defines the width of the prediction interval.
    intermittent
    Defines the type of intermittent model used. Can be: 1. none, meaning that the data should be considered non-intermittent; 2. fixed, which takes into account a constant Bernoulli distribution of demand occurrences; 3. croston, based on the Croston (1972) method with SBA correction; 4. tsb, based on the Teunter et al. (2011) method; 5. auto, automatic selection of the intermittency type based on the data. The first letter can be used instead of the full word.
    bounds
    What type of bounds to use for the smoothing parameters. The first letter can be used instead of the whole word.
    silent
    If silent="none", then nothing is silent: everything is printed out and drawn. silent="all" means that nothing is produced or drawn (except for warnings). In case of silent="graph", no graph is produced. If silent="legend", then the legend of the graph is skipped. Finally, silent="output" means that nothing is printed out in the console, but the graph is produced. silent also accepts TRUE and FALSE. In this case silent=TRUE is equivalent to silent="all", while silent=FALSE is equivalent to silent="none". The parameter also accepts the first letter of each word ("n", "a", "g", "l", "o").
    xreg
    Vector (either numeric or time series) or matrix (or data.frame) of exogenous variables that should be included in the model. If a matrix is provided, then columns should contain variables and rows should contain observations. Note that xreg should have a number of observations equal either to the in-sample or to the whole series. If the number of observations in xreg is equal to the in-sample, then values for the holdout sample are produced using the Naive method.
    updateX
    If TRUE, the transition matrix for exogenous variables is estimated, introducing non-linear interactions between parameters. Prerequisite: non-NULL xreg.
    ...
    Other non-documented parameters. For example, FI=TRUE will make the function also produce the Fisher Information matrix, which can then be used to calculate variances of the parameters of the model. Maximum orders to check can also be specified separately; in that case the orders variable must be set to NULL: ar.orders - maximum order of the AR term. Can be a vector, defining max orders of AR, SAR etc. i.orders - maximum order of I. Can be a vector, defining max orders of I, SI etc. ma.orders - maximum order of the MA term. Can be a vector, defining max orders of MA, SMA etc.
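
The two ways of passing maximum orders described in the arguments above can be sketched as follows. This assumes the interface of the version documented on this page (1.5.1); the series x is simulated, the names m1 and m2 are arbitrary, and the estimation is skipped when the smooth package is not installed.

```r
# Lags: non-seasonal (1) and seasonal (12), one entry per ARIMA part
lags <- c(1, 12)

# Maximum orders as a list: ar[k], i[k] and ma[k] belong to lags[k]
orders <- list(ar = c(2, 1), i = c(1, 1), ma = c(2, 1))

# At least one of the order vectors should be as long as lags
stopifnot(any(lengths(orders) == length(lags)))

set.seed(41)
x <- ts(rnorm(118, 100, 3), frequency = 12)

if (requireNamespace("smooth", quietly = TRUE)) {
  # Maximum orders passed as a list
  m1 <- smooth::auto.ssarima(x, orders = orders, lags = lags,
                             h = 18, holdout = TRUE)

  # The same maximum orders passed separately; orders must then be NULL
  m2 <- smooth::auto.ssarima(x, orders = NULL,
                             ar.orders = orders$ar,
                             i.orders = orders$i,
                             ma.orders = orders$ma,
                             lags = lags, h = 18, holdout = TRUE)
}
```

Both calls describe the same search space: up to ARIMA(2,1,2) for the non-seasonal part and up to (1,1,1) at lag 12.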

    Value

    Object of class "smooth" is returned. See ssarima for details.

    Details

    The function constructs a set of ARIMA models in Single Source of Error state-space form (see ssarima documentation) and selects the best one based on the information criterion.

    Due to the flexibility of the model, multiple seasonalities can be used. For example, something as elaborate as SARIMA(1,1,1)(0,1,1)[24](2,0,1)[24*7](0,0,1)[24*30] can be constructed, but the estimation may take a long time.
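
As an illustration of how such a specification maps onto the orders and lags arguments, here is a short sketch. In auto.ssarima the orders are treated as maximum orders to check, so the selected model may be smaller. The hourly series y is simulated, the names label, runEstimation and bigModel are introduced here for illustration, and the estimation is switched off by default because it may take a long time.

```r
# Lags for hourly data: non-seasonal, daily, weekly and monthly cycles
lags <- c(1, 24, 24 * 7, 24 * 30)

# Orders of SARIMA(1,1,1)(0,1,1)[24](2,0,1)[24*7](0,0,1)[24*30],
# one value per lag in each vector
orders <- list(ar = c(1, 0, 2, 0),
               i  = c(1, 1, 0, 0),
               ma = c(1, 1, 1, 1))

# Every order vector needs one entry per lag
stopifnot(all(lengths(orders) == length(lags)))

# Rebuild the model label from the specification as a sanity check
label <- paste0("SARIMA",
                paste0("(", orders$ar, ",", orders$i, ",", orders$ma, ")",
                       c("", paste0("[", lags[-1], "]")),
                       collapse = ""))
print(label)

# Estimation may take a long time, so it is switched off by default
runEstimation <- FALSE
if (runEstimation && requireNamespace("smooth", quietly = TRUE)) {
  set.seed(1)
  y <- ts(rnorm(24 * 30 * 4, 100, 3), frequency = 24)  # hypothetical hourly series
  bigModel <- smooth::auto.ssarima(y, orders = orders, lags = lags, h = 24)
}
```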

    References

    1. Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. http://www.exponentialsmoothing.net.

    See Also

    ets, es, ces, sim.es, ges, ssarima

    Examples

    library(smooth)

    x <- rnorm(118,100,3)

    # The best ARIMA for the data
    ourModel <- auto.ssarima(x,orders=list(ar=c(2,1),i=c(1,1),ma=c(2,1)),lags=c(1,12),
                             h=18,holdout=TRUE,intervals="np")

    # The other one using optimised states
    ## Not run:
    # auto.ssarima(x,orders=list(ar=c(3,2),i=c(2,1),ma=c(3,2)),lags=c(1,12),
    #              initial="o",h=18,holdout=TRUE)
    ## End(Not run)

    # And now combined ARIMA
    ## Not run:
    # auto.ssarima(x,orders=list(ar=c(3,2),i=c(2,1),ma=c(3,2)),lags=c(1,12),
    #              combine=TRUE,h=18,holdout=TRUE)
    ## End(Not run)

    # Summary, point forecasts and a forecast plot for the selected model
    summary(ourModel)
    forecast(ourModel)
    plot(forecast(ourModel))
    
    
