The function constructs a Generalised Univariate Model, estimating the matrix F, the vectors w and g, and the initial parameters.
gum(y, orders = c(1, 1), lags = c(1, frequency(y)), type = c("additive",
"multiplicative"), initial = c("backcasting", "optimal", "two-stage",
"complete"), persistence = NULL, transition = NULL,
measurement = rep(1, sum(orders)), loss = c("likelihood", "MSE", "MAE",
"HAM", "MSEh", "TMSE", "GTMSE", "MSCE"), h = 0, holdout = FALSE,
bounds = c("admissible", "none"), silent = TRUE, model = NULL,
xreg = NULL, regressors = c("use", "select", "adapt", "integrate"),
initialX = NULL, ...)
auto.gum(y, orders = 3, lags = frequency(y), type = c("additive",
"multiplicative", "select"), initial = c("backcasting", "optimal",
"two-stage", "complete"), ic = c("AICc", "AIC", "BIC", "BICc"),
loss = c("likelihood", "MSE", "MAE", "HAM", "MSEh", "TMSE", "GTMSE",
"MSCE"), h = 0, holdout = FALSE, bounds = c("admissible", "none"),
silent = TRUE, xreg = NULL, regressors = c("use", "select", "adapt",
"integrate"), ...)
gum_old(y, orders = c(1, 1), lags = c(1, frequency(y)),
type = c("additive", "multiplicative"), persistence = NULL,
transition = NULL, measurement = rep(1, sum(orders)),
initial = c("optimal", "backcasting"), loss = c("likelihood", "MSE",
"MAE", "HAM", "MSEh", "TMSE", "GTMSE", "MSCE"), h = 10, holdout = FALSE,
bounds = c("restricted", "admissible", "none"), silent = c("all",
"graph", "legend", "output", "none"), xreg = NULL, regressors = c("use",
"select"), initialX = NULL, ...)
ges(...)
Object of class "adam" is returned with similar elements to the adam function.
Vector or ts object containing the data to be forecasted.
Order of the model. Specified as a vector of the number of states
with different lags. For example, orders=c(1,1) means that there are
two states: one of the first lag type, the second of the second type.
In case of auto.gum(), this parameter is the value of the maximum order
to check.
Defines lags for the corresponding orders. If, for example,
orders=c(1,1) and lags are defined as lags=c(1,12), then the
model will have two states: the first will have lag 1 and the second will
have lag 12. The length of lags must correspond to the length of
orders. In case of auto.gum(), this is the value of the maximum
lag to check. This should usually be the maximum frequency of the data.
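A minimal base-R sketch of how orders and lags fit together, using the values from the description above and from the examples on this page. It only checks the length rule and counts components via sum(orders), which is the size used by the default measurement = rep(1, sum(orders)) in the usage; the variable names are illustrative and not part of the package.

```r
# Values taken from the argument descriptions and examples on this page
orders <- c(1, 1)
lags <- c(1, 12)

# The length of lags must correspond to the length of orders
stopifnot(length(orders) == length(lags))

# Number of components in the state vector, as implied by the
# default measurement = rep(1, sum(orders))
n_components <- sum(orders)
default_measurement <- rep(1, n_components)

# A second specification: two states with lag 1, one state with lag 4
orders2 <- c(2, 1)
lags2 <- c(1, 4)
n_components2 <- sum(orders2)
```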
Type of model. Can either be "additive" or "multiplicative". The latter
means that the GUM is fitted on log-transformed data. In case of
auto.gum(), can also be "select", implying automatic selection of the type.
Can be either a character string, a list, or a vector of initial states.
If it is character, then it can be "backcasting", meaning that the initials of
the dynamic part of the model are produced using the backcasting procedure
(advised for data with high frequency), or "optimal", meaning that all initial
states are optimised, or "two-stage", meaning that optimisation is done
after the backcasting, refining the states. In case of backcasting, the parameters of the
explanatory variables are optimised. Alternatively, you can set initial="complete",
which means that all states (including explanatory variables) are initialised
via backcasting.
Persistence vector \(g\), containing smoothing parameters. If NULL, then estimated.
Transition matrix \(F\). Can be provided as a vector.
The matrix will be formed using the default matrix(transition,nc,nc),
where nc is the number of components in the state vector. If
NULL, then estimated.
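Because matrix() in R fills column by column, the order of the elements in a transition vector matters. The sketch below uses made-up values for a two-component model to show which element ends up where; F_matrix and F_by_row are illustrative names only.

```r
nc <- 2                      # number of components in the state vector
transition <- c(1, 0, 1, 1)  # hypothetical values, for illustration only

# Same call as in the description: filled column by column
F_matrix <- matrix(transition, nc, nc)
# First column is (1, 0), second column is (1, 1)

# If the vector was written row by row instead, transpose it before passing
F_by_row <- t(matrix(transition, nc, nc))
```

In this sketch F_matrix has a 1 in the upper right corner, while F_by_row has it in the lower left, so it is worth printing the matrix before passing it on.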
Measurement vector \(w\). If NULL, then estimated.
The type of loss function used in optimisation. loss can be:
likelihood (assuming Normal distribution of the error term),
MSE (Mean Squared Error), MAE (Mean Absolute Error),
HAM (Half Absolute Moment), TMSE (Trace Mean Squared Error),
GTMSE (Geometric Trace Mean Squared Error), MSEh (optimisation
using only the h-steps-ahead error), MSCE (Mean Squared Cumulative Error).
If loss!="MSE", then likelihood and model selection is done based
on equivalent MSE. Model selection in this case is not optimal.
There are also analytical approximations available for the multistep functions:
aMSEh, aTMSE and aGTMSE. These can be useful in cases
of small samples.
Finally, just for fun, the absolute and half analogues of the multistep estimators
are available: MAEh, TMAE, GTMAE, MACE, HAMh, THAM, GTHAM, CHAM.
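To make the one-step-ahead losses concrete, here is a small base-R sketch computing MSE, MAE and HAM from a vector of errors. The data are made up, and the formulas are the textbook definitions (HAM taken as the mean of the square roots of the absolute errors); the package may apply additional scaling internally, so treat this as an illustration rather than its exact implementation.

```r
# Made-up actual values and one-step-ahead fitted values
actuals <- c(102, 98, 105, 101)
fitted_values <- c(101, 102, 96, 101)
errors <- actuals - fitted_values

mse <- mean(errors^2)            # Mean Squared Error
mae <- mean(abs(errors))         # Mean Absolute Error
ham <- mean(sqrt(abs(errors)))   # Half Absolute Moment (assumed definition)
```

The three measures penalise large errors differently: MSE most strongly, HAM least, which is why the choice of loss changes how much outliers influence the estimates.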
Length of forecasting horizon.
If TRUE, a holdout sample of size h is taken from the end of the data.
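The effect of h together with holdout=TRUE can be pictured as a simple split of the series. The following base-R sketch mimics that split on the built-in BJsales data; it is an illustration of the idea, not the package's internal code, and the variable names are assumptions.

```r
y <- as.numeric(BJsales)  # built-in dataset used in the examples
h <- 8                    # forecasting horizon
n <- length(y)

# With holdout = TRUE, the last h observations are set aside
in_sample <- y[seq_len(n - h)]
holdout_sample <- y[(n - h + 1):n]
```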
The type of bounds for the parameters to use in the model
estimation. Can be either "admissible", guaranteeing the stability of the
model, or "none", meaning no restrictions (potentially dangerous).
Accepts TRUE and FALSE. If FALSE, the function
will print its progress and produce a plot at the end.
A previously estimated GUM model. If provided, the function will not estimate anything and will use all of its parameters.
The vector (either numeric or time series) or the matrix (or
data.frame) of exogenous variables that should be included in the model. If a
matrix is provided, then columns should contain variables and rows should contain observations.
Note that xreg should have the number of observations equal either to the
in-sample or to the whole series. If the number of observations in
xreg is equal to the in-sample, then values for the holdout sample are
produced using the es function.
The variable defines what to do with the provided xreg:
"use" means that all of the data should be used, while
"select" means that a selection using ic should be done.
The vector of initial parameters for exogenous variables.
Ignored if xreg is NULL.
Other non-documented parameters. See adam for
details. However, there are several unique parameters passed to the optimiser
in comparison with adam:
1. algorithm0, which defines what algorithm to use in nloptr for the initial
optimisation. By default, this is "NLOPT_LN_BOBYQA".
2. algorithm, which determines the second optimiser. By default, this is
"NLOPT_LN_NELDERMEAD".
3. maxeval0 and maxeval, which determine the number of iterations for the two
optimisers. By default, maxeval0=maxeval=40*k, where
k is the number of estimated parameters.
4. xtol_rel0 and xtol_rel, which are 1e-8 and 1e-6 respectively.
There are also ftol_rel0, ftol_rel, ftol_abs0 and ftol_abs, which by default
are set to values explained in the nloptr.print.options() function.
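As a sketch of how the optimiser settings listed above might be passed through the ellipsis, the code below collects them in a list and forwards them to gum(). The argument names and defaults are the ones given in the description; k = 5 is an assumed number of estimated parameters chosen purely for illustration, and the call is only attempted if the smooth package is installed.

```r
# Assumed number of estimated parameters, for illustration only
k <- 5

# Optimiser settings named in the description above, at their stated defaults
optimiser_args <- list(
  algorithm0 = "NLOPT_LN_BOBYQA",
  algorithm = "NLOPT_LN_NELDERMEAD",
  maxeval0 = 40 * k,
  maxeval = 40 * k,
  xtol_rel0 = 1e-8,
  xtol_rel = 1e-6
)

# Forward the settings via '...' only when the package is available
if (requireNamespace("smooth", quietly = TRUE)) {
  fit <- do.call(
    smooth::gum,
    c(list(y = BJsales, h = 8, holdout = TRUE), optimiser_args)
  )
}
```

Keeping the settings in a list makes it easy to reuse the same optimiser configuration across several models.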
The information criterion used in the model selection procedure.
Ivan Svetunkov, ivan@svetunkov.com
The function estimates the Single Source of Error state space model of the following type:
$$y_{t} = w_t' v_{t-l} + \epsilon_{t}$$
$$v_{t} = F v_{t-l} + g_{t} \epsilon_{t}$$
where \(v_{t}\) is the state vector (defined using orders) and
\(l\) is the vector of lags, \(w_t\) is the measurement
vector (which includes fixed elements and explanatory variables),
\(F\) is the transition matrix, \(g_t\) is the persistence
vector (includes explanatory variables as well if provided), and finally,
\(\epsilon_{t}\) is the error term.
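The recursion above can be traced by hand with a short base-R simulation. This is a sketch under simplifying assumptions, not the package's implementation: it uses one lag per component, time-invariant w and g, no explanatory variables, and a hypothetical helper named simulate_gum whose arguments are made up for this illustration.

```r
# Hypothetical helper: simulates y_t = w' v_{t-l} + e_t, v_t = F v_{t-l} + g e_t
simulate_gum <- function(transition, measurement, persistence, lags,
                         initial, errors) {
  nc <- length(lags)        # one lag per component (assumption)
  max_lag <- max(lags)
  n <- length(errors)
  # Columns 1..max_lag hold pre-sample states; column max_lag + t holds v_t
  states <- matrix(NA_real_, nc, max_lag + n)
  states[, seq_len(max_lag)] <- initial
  y <- numeric(n)
  for (t in seq_len(n)) {
    # v_{t-l}: component i is read at its own lag l_i
    v_lagged <- states[cbind(seq_len(nc), max_lag + t - lags)]
    y[t] <- sum(measurement * v_lagged) + errors[t]
    states[, max_lag + t] <- transition %*% v_lagged + persistence * errors[t]
  }
  list(y = y, states = states)
}

# Two components with lags 1 and 4, made-up parameter values
set.seed(41)
sim <- simulate_gum(
  transition = diag(2), measurement = c(1, 1), persistence = c(0.3, 0.1),
  lags = c(1, 4), initial = c(100, 5), errors = rnorm(40, 0, 3)
)
```

With an identity transition matrix and zero errors the simulated series stays flat at the sum of the initial states, which is a convenient sanity check on the indexing of the lagged states.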
For some more information about the model and its implementation, see the
vignette: vignette("gum","smooth")
Svetunkov I. (2023) Smooth forecasting with the smooth package in R. arXiv:2301.01790. doi:10.48550/arXiv.2301.01790.
Svetunkov I. (2015 - Inf) "smooth" package for R - series of posts about the underlying models and how to use them: https://openforecast.org/category/r-en/smooth/.
Svetunkov, I., 2023. Smooth Forecasting with the Smooth Package in R. arXiv. doi:10.48550/arXiv.2301.01790.
Snyder, R. D., 1985. Recursive Estimation of Dynamic Linear Models. Journal of the Royal Statistical Society, Series B (Methodological) 47 (2), 272-276.
Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008) Forecasting with exponential smoothing: the state space approach, Springer-Verlag. doi:10.1007/978-3-540-71918-2.
adam, es, ces
gum, es, ces, sim.es, ssarima
library(smooth)
gum(BJsales, h=8, holdout=TRUE)
ourModel <- gum(rnorm(118,100,3), orders=c(2,1), lags=c(1,4), h=18, holdout=TRUE)
# Reuse the previously estimated model on new data
gum(rnorm(118,100,3), model=ourModel, h=18)
# Produce something crazy with optimal initials (not recommended)
gum(rnorm(118,100,3), orders=c(1,1,1), lags=c(1,3,5), h=18, holdout=TRUE, initial="o")
# Simpler model estimated using the trace forecast error loss function (TMSE)
gum(rnorm(118,100,3), orders=c(1), lags=c(1), h=18, holdout=TRUE, bounds="n", loss="TMSE")
x <- rnorm(50,100,3)
# The best GUM model for the data
ourModel <- auto.gum(x, orders=2, lags=4, h=18, holdout=TRUE)
summary(ourModel)