
ssarima(data, orders=list(ar=0,i=c(1),ma=c(1)), lags=c(1), constant=FALSE, AR=NULL, MA=NULL, initial=c("backcasting","optimal"), ic=c("AICc","AIC","BIC"), cfType=c("MSE","MAE","HAM","MLSTFE","MSTFE","MSEh"), h=10, holdout=FALSE, intervals=c("none","parametric","semiparametric","nonparametric"), level=0.95, intermittent=c("none","auto","fixed","croston","tsb","sba"), bounds=c("admissible","none"), silent=c("none","all","graph","legend","output"), xreg=NULL, xregDo=c("use","select"), initialX=NULL, updateX=FALSE, persistenceX=NULL, transitionX=NULL, ...)
orders - list containing the vector variables ar, i and ma. Example: orders=list(ar=c(1,2),i=c(1),ma=c(1,1,1)). If a variable is not provided in the list, it is assumed to be equal to zero. At least one variable should have the same length as lags.
lags - the length of lags must correspond to the length of either ar, i or ma in the orders variable. There are no restrictions on the length of the lags vector. It is recommended to order lags ascending.
constant - if TRUE, a constant term is included in the model. Can also be a number (constant value).
initial - can be "optimal", meaning that the initial states are optimised, or "backcasting", meaning that the initials are produced using a backcasting procedure (advised for data with high frequency). If character, then initial.season will be estimated in the way defined by initial.
cfType - can be: MSE (Mean Squared Error), MAE (Mean Absolute Error), HAM (Half Absolute Moment), MLSTFE (Mean Log Squared Trace Forecast Error), MSTFE (Mean Squared Trace Forecast Error) and MSEh (optimisation using only the h-steps-ahead error). If cfType!="MSE", then the likelihood and model selection are based on the equivalent MSE, and model selection is no longer optimal. Analytical approximations for the multistep functions are also available: aMSEh, aMSTFE and aMLSTFE. These can be useful for small samples.
holdout - if TRUE, a holdout of size h is taken from the end of the data.
intervals - type of intervals to construct. This can be:
none, aka n - do not produce prediction intervals.
parametric, p - use the state-space structure of ETS. For mixed models this is done using simulations, which may take longer than for the pure additive and pure multiplicative models.
semiparametric, sp - intervals based on the covariance matrix of 1 to h steps ahead errors and the assumption of a normal / log-normal distribution (depending on error type).
nonparametric, np - intervals based on values from a quantile regression on the error matrix (see Taylor and Bunn, 1999). The model used in this process is e[j] = a j^b, where j=1,..,h.
The parameter also accepts TRUE and FALSE. The former means that parametric intervals are constructed, while the latter is equivalent to none.
intermittent - type of model used for intermittent data. Can be:
1. none, meaning that the data should be considered as non-intermittent;
2. fixed, taking into account a constant Bernoulli distribution of demand occurrences;
3. croston, based on the Croston, 1972 method with SBA correction;
4. tsb, based on the Teunter et al., 2011 method;
5. auto - automatic selection of intermittency type based on information criteria;
6. "sba" - Syntetos-Boylan Approximation for Croston's method (bias correction) discussed in Syntetos and Boylan, 2005.
The first letter can be used instead.
silent - if silent="none", then nothing is silent: everything is printed out and drawn. silent="all" means that nothing is produced or drawn (except for warnings). If silent="graph", no graph is produced. If silent="legend", the legend of the graph is skipped. Finally, silent="output" means that nothing is printed out in the console, but the graph is produced. silent also accepts TRUE and FALSE. In this case silent=TRUE is equivalent to silent="all", while silent=FALSE is equivalent to silent="none". The parameter also accepts the first letter of each word ("n", "a", "g", "l", "o").
xreg - should have a number of observations equal either to the in-sample or to the whole series. If the number of observations in xreg is equal to the in-sample, then values for the holdout sample are produced using Naive.
xregDo - "use" means that all of the data should be used, while "select" means that a selection using ic should be done. "combine" will be available at some point in the future.
initialX - initial values for parameters of exogenous variables. Not used if xreg is NULL.
updateX - if TRUE, the transition matrix for exogenous variables is estimated, introducing non-linear interactions between parameters. Prerequisite - non-NULL xreg.
persistenceX - persistence vector g for exogenous variables. If NULL, then estimated. Prerequisite - non-NULL xreg.
transitionX - transition matrix F for exogenous variables, formed using matrix(transition,nc,nc), where nc is the number of components in the state vector. If NULL, then estimated. Prerequisite - non-NULL xreg.
... - vectors of orders can be passed here using ar.orders, i.orders and ma.orders. The orders variable needs to be NULL in this case.
The parameter model can accept a previously estimated SSARIMA model and use all its parameters.
FI=TRUE makes the function produce the Fisher Information matrix, which can then be used to calculate the variances of the parameters of the model.
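The relation between orders and lags described above can be sketched in base R. The helper describeSarima below is hypothetical and is not part of the package. It assumes that a vector of orders shorter than lags is padded with zeros for the remaining lags, which extends the documented rule that a missing variable is treated as zero.

```r
# Hypothetical helper (not part of smooth): pad orders with zeros and
# print the SARIMA specification implied by an orders list and lags.
describeSarima <- function(orders, lags) {
    padOrder <- function(x) {
        if (is.null(x)) x <- 0                     # missing variable -> zero order
        stopifnot(length(x) <= length(lags))       # no more orders than lags
        c(x, rep(0, length(lags) - length(x)))     # assumption: pad with zeros
    }
    ar <- padOrder(orders$ar)
    i  <- padOrder(orders$i)
    ma <- padOrder(orders$ma)
    paste0("SARIMA",
           paste0("(", ar, ",", i, ",", ma, ")[", lags, "]", collapse = ""))
}

describeSarima(list(ar = c(2, 1)), lags = c(1, 4))
# "SARIMA(2,0,0)[1](1,0,0)[4]"
describeSarima(list(ar = 1, i = 1, ma = c(1, 1)), lags = c(1, 4))
# "SARIMA(1,1,1)[1](0,0,1)[4]"
```

Each position in ar, i and ma is paired with the lag at the same position, which is why at least one of the vectors needs to be as long as lags.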
model - the name of the estimated model.
timeElapsed - time elapsed for the construction of the model.
states - the matrix of the fuzzy components of ssarima, where rows correspond to time and cols to states.
transition - matrix F.
persistence - the persistence vector. This is where the smoothing parameters live.
AR - the matrix of coefficients of AR terms.
I - the matrix of coefficients of I terms.
MA - the matrix of coefficients of MA terms.
constant - the value of the constant term.
initialType - type of initial values used.
initial - the initial values of the state vector (extracted from states).
nParam - number of estimated parameters.
fitted - the fitted values of the model.
forecast - the point forecast of the model.
lower - the lower bound of the prediction interval. When intervals="none", NA is returned.
upper - the upper bound of the prediction interval. When intervals="none", NA is returned.
residuals - the residuals of the estimated model.
errors - the matrix of 1 to h steps ahead errors.
s2 - variance of the residuals (taking degrees of freedom into account).
intervals - type of intervals asked by user.
level - confidence level for intervals.
actuals - the original data.
holdout - the holdout part of the original data.
iprob - the fitted and forecasted values of the probability of demand occurrence.
intermittent - type of intermittent model fitted to the data.
xreg - provided vector or matrix of exogenous variables. If xregDo="s", then this value will contain only the selected exogenous variables.
updateX - boolean, defining whether the states of exogenous variables were estimated as well.
initialX - initial values for parameters of exogenous variables.
persistenceX - persistence vector g for exogenous variables.
transitionX - transition matrix F for exogenous variables.
ICs - values of information criteria of the model. Includes AIC, AICc and BIC.
logLik - log-likelihood of the function.
cf - cost function value.
cfType - type of cost function used in the estimation.
FI - Fisher Information. Equal to NULL if FI=FALSE or when FI is not provided at all.
accuracy - the vector of accuracy measures for the holdout sample. Includes MPE, MAPE, SMAPE, MASE, MAE/mean, RelMAE and Bias coefficient (based on complex numbers). Available only when holdout=TRUE.
This model is then transformed into ARIMA in the Single Source of Error State-space form (proposed in Snyder, 1985):
$y_{t} = o_{t} (w' v_{t-l} + x_{t} a_{t-1} + \epsilon_{t})$
$v_{t} = F v_{t-1} + g \epsilon_{t}$
$a_{t} = F_{X} a_{t-1} + g_{X} \epsilon_{t} / x_{t}$
where $o_{t}$ is a Bernoulli distributed random variable (equal to 1 for all observations in the case of normal data), $v_{t}$ is the state vector (defined using ar.orders and i.orders) and $x_{t}$ is the vector of exogenous parameters.
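As an illustration of the measurement and transition equations above, here is a minimal base R sketch for one special case, ARIMA(0,1,1), with no exogenous variables and $o_{t}=1$. It is not the implementation used by ssarima. It assumes a single state with w = 1, F = 1 and g = 1 + theta, where theta is a hypothetical MA(1) coefficient and the model is written as y[t] - y[t-1] = e[t] + theta e[t-1].

```r
# Base R sketch (an assumption-laden illustration, not smooth's code):
# ARIMA(0,1,1) in single source of error form with one state.
set.seed(41)
obs <- 50
theta <- -0.6                   # hypothetical MA(1) coefficient
epsilon <- rnorm(obs, 0, 3)     # the single source of error
v <- numeric(obs + 1)           # v[t + 1] holds the state at time t
v[1] <- 100                     # initial state
y <- numeric(obs)
for (t in seq_len(obs)) {
    y[t] <- v[t] + epsilon[t]                    # measurement: y_t = w' v_{t-1} + epsilon_t
    v[t + 1] <- v[t] + (1 + theta) * epsilon[t]  # transition: v_t = F v_{t-1} + g epsilon_t
}
# The first differences of y follow an MA(1) process driven by the same errors
ma1 <- epsilon[-1] + theta * epsilon[-obs]
isTRUE(all.equal(diff(y), ma1))
```

The final check confirms that the recursion reproduces the ARIMA(0,1,1) difference equation, so the state-space form and the conventional form describe the same series in this case.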
Due to the flexibility of the model, multiple seasonalities can be used. For example, something crazy like this can be constructed: SARIMA(1,1,1)(0,1,1)[24](2,0,1)[24*7](0,0,1)[24*30], but the estimation may take a lot of time...
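The multiple seasonal model named above could be written in terms of orders and lags roughly as follows. This is a sketch: hourlyData is a hypothetical hourly series, and the ssarima call is left commented out because the estimation may be slow.

```r
# SARIMA(1,1,1)(0,1,1)[24](2,0,1)[24*7](0,0,1)[24*30] as orders and lags
lags <- c(1, 24, 24 * 7, 24 * 30)
orders <- list(ar = c(1, 0, 2, 0),
               i  = c(1, 1, 0, 0),
               ma = c(1, 1, 1, 1))
# Every vector of orders has one entry per lag
stopifnot(all(lengths(orders) == length(lags)))
# Hypothetical call (not run), assuming hourlyData exists:
# ssarima(hourlyData, orders = orders, lags = lags, h = 24)
```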
auto.arima, orders, lags, sim.ssarima
# ARIMA(1,1,1) fitted to some data
ourModel <- ssarima(rnorm(118,100,3),orders=list(ar=c(1),i=c(1),ma=c(1)),lags=c(1),h=18,
holdout=TRUE,intervals="p")
# The previous one is equivalent to:
## Not run: 
# ourModel <- ssarima(rnorm(118,100,3),ar.orders=c(1),i.orders=c(1),ma.orders=c(1),lags=c(1),h=18,
#                     holdout=TRUE,intervals="p")
## End(Not run)
# Model with the same lags and orders, applied to a different data
ssarima(rnorm(118,100,3),orders=orders(ourModel),lags=lags(ourModel),h=18,holdout=TRUE)
# The same model applied to a different data
ssarima(rnorm(118,100,3),model=ourModel,h=18,holdout=TRUE)
# Example of SARIMA(2,0,0)(1,0,0)[4]
## Not run: 
# ssarima(rnorm(118,100,3),orders=list(ar=c(2,1)),lags=c(1,4),h=18,holdout=TRUE)
## End(Not run)
# SARIMA(1,1,1)(0,0,1)[4] with different initialisations
## Not run: 
# ssarima(rnorm(118,100,3),orders=list(ar=c(1),i=c(1),ma=c(1,1)),
#         lags=c(1,4),h=18,holdout=TRUE)
# ssarima(rnorm(118,100,3),orders=list(ar=c(1),i=c(1),ma=c(1,1)),
#         lags=c(1,4),h=18,holdout=TRUE,initial="o")
## End(Not run)
# SARIMA of a peculiar order on AirPassengers data
## Not run: 
# ssarima(AirPassengers,orders=list(ar=c(1,0,3),i=c(1,0,1),ma=c(0,1,2)),lags=c(1,6,12),
#         h=10,holdout=TRUE)
## End(Not run)
# ARIMA(1,1,1) with Mean Squared Trace Forecast Error
## Not run: 
# ssarima(rnorm(118,100,3),orders=list(ar=1,i=1,ma=1),lags=1,h=18,holdout=TRUE,cfType="MSTFE")
# ssarima(rnorm(118,100,3),orders=list(ar=1,i=1,ma=1),lags=1,h=18,holdout=TRUE,cfType="aMSTFE")
## End(Not run)
# SARIMA(0,1,1) with exogenous variables
ssarima(rnorm(118,100,3),orders=list(i=1,ma=1),h=18,holdout=TRUE,xreg=c(1:118))
# SARIMA(0,1,1) with exogenous variables with crazy estimation of xreg
## Not run: 
# ourModel <- ssarima(rnorm(118,100,3),orders=list(i=1,ma=1),h=18,holdout=TRUE,
#                     xreg=c(1:118),updateX=TRUE)
## End(Not run)
summary(ourModel)
forecast(ourModel)
plot(forecast(ourModel))