Simultaneously estimate the trend and the seasonality via locally weighted regression in an equidistant time series under short memory. The default setting uses an iterative plug-in algorithm for identifying the asymptotically globally optimal bandwidth for smoothing.
deseats(
y,
smoothing_options = set_options(),
bwidth_start = NULL,
inflation_rate = c("optimal", "naive"),
correction_factor = FALSE,
autocor = TRUE,
drop = NULL,
error_model = c("free", "ARMA"),
nar_lim = c(0, 3),
nma_lim = c(0, 3),
arma_mean = FALSE
)The function returns and S4 object with the following elements (access them
via @):
boundary_methodthe applied boundary method.
bwidththe globally applied bandwidth in the smoothing process; if not if no input is given in the function call, this is the automatically selected optimal bandwidth.
decompAn object of class "mts" that consists of the
decomposed time series data.
frequencythe frequency of the time series.
kernel_funthe second-order kernel function considered for weighting.
order_polythe order of polynomial considered locally for the trend.
order_polythe order of polynomial considered locally for the trend.
sum_autocovthe estimated sum of autocovariances.
ts_namethe object name of the initially provided time series object.
weights_wfuna matrix that gives the weights of the weighting function \(K\) at each estimation time point; ; if \(n\) is the length of the given time series and \(b\) is the applied (relative) bandwidth, then the first row of the weighting system gives the weighting function weights when estimating at \(t=1\), the second row gives the weights when estimating at \(t=2\) and so on for all left-hand side boundary points until the middle row, which contains the weights used at all interior points; the rows following the middle row contain the weights for right-hand side boundary points (the rows are ordered chronologically)
weightsan array with many slices that represent the weighting
systems for various filters; each slice is a matrix, which gives the weighting
system to estimate a component, for example trend + seasonality, as a weighted
average from the given time series; if
\(n\) is the length of the given time series and \(b\) is the applied
(relative) bandwidth, then the first row of the weighting system gives the
weights to obtain estimates at \(t=1\), the second row gives the weights to
obtain estimates at \(t=2\) and so on for all left-hand side boundary points
until the middle row, which contains the
weights used at all interior points; the rows following the middle row contain
the weights for right-hand side boundary points (the rows are ordered
chronologically);
the slice names are "Trend", "Season" and "Combined",
where "Combined" are the weights to estimate trend + seasonality
combined.
a numerical vector or a time series object of class ts or
that can be transformed with as.ts to an object of class
ts; for these observations, trend and seasonality will be obtained.
an S4 object of class smoothing_options, which
is returned by the function set_options; it
includes details about the
options to consider in the locally weighted regression such as the order of
polynomial and the bandwidth for smoothing among others.
a single numeric value that is only relevant if the slot
bwidth in smoothing_options is set to NA;
as the bandwidth will then
be selected automatically, bwidth_start sets the initial bandwidth for
the algorithm; the default, bwidth_start = NULL, corresponds to
bwidth_start = 0.1 for a local linear trend and to
bwidth_start = 0.2 for a local cubic trend.
a character vector of length one that indicates, which inflation rate
to use in the bandwidth selection; for a local linear trend, we have
inflation_rate = "optimal" as the default, for a local cubic trend
it is inflation_rate = "naive", which correspond to inflation rates
of 5/7 and 9/13, respectively.
A logical vector of length one; theoretically, a
larger bandwidth to estimate the sum of autocovariances from residuals of
pilot trend and seasonality estimates is advisable than for estimating trend
and seasonality; for correction_factor = TRUE, this is implemented;
for error_model = "ARMA", correction_factor = FALSE is
enforced; the default is correction_factor = FALSE, because it was
found that setting this argument to TRUE often overestimates the
bandwidth.
a logical vector of length one; indicates whether to consider
autocorrelated errors (TRUE) or independent but identically
distributed errors (FALSE); the default is autocor = TRUE.
a numeric vector of length one that indicates the proportion of
the observations to not include at each boundary in the bandwidth estimation
process, if a bandwidth is selected automatically; the default is
drop = NULL, which corresponds to drop = 0.05 for a
local linear trend and to drop = 0.1 for a local cubic trend.
a character vector of length one that indicates whether
for autocor = TRUE the sum of autocovariances of the errors is
obtained purely nonparametrically ("free") or whether an
autoregressive moving-average (ARMA) model is assumed "ARMA"; the
default is error_model = "free".
only valid for error_model = "ARMA"; set the minimum and
maximum AR order to check via the BIC in each iteration of the algorithm via
a two-element vector.
only valid for error_model = "ARMA"; set the minimum and
maximum MA order to check via the BIC in each iteration of the algorithm via
a two-element vector.
only valid for error_model = "ARMA"; decide whether to
include an estimate of the mean in the ARMA fitting for the detrended series.
Dominik Schulz (Research Assistant) (Department of Economics, Paderborn
University),
Author and Package Creator
Yuanhua Feng (Department of Economics, Paderborn
University),
Author
Trend and seasonality are estimated based on the additive nonparametric regression model for an equidistant time series $$y_t = m(x_t) + s(x_t) + \epsilon_t,$$ where \(y_t\) is the observed time series with \(t=1,...n\), \(x_t = t / n\) is the rescaled time on the interval \([0, 1]\), \(m(x_t)\) is a smooth and deterministic trend function, \(s(x_t)\) is a (slowly changing) seasonal component with seasonal period \(p_s\) and \(\epsilon_t\) are stationary errors with \(E(\epsilon_t) = 0\) and short-range dependence (see for example also Feng, 2013, for a similar model, where the stochastic term is however i.i.d.).
It is assumed that \(m\) and \(s\) can be approximated locally by a polynomial of small order and by a trigonometric polynomial, respectively. Through locally weighted regression, \(m\) and \(s\) can therefore be estimated given a suitable bandwidth.
The iterative-plug-in (IPI) algorithm, which numerically minimizes the asymptotic mean squared error (AMISE) to select a bandwidth is an extension of Feng (2013) to the case with short-range dependence in the errors. To achieve this goal, the error variance in the AMISE in Feng (2013) is replaced by the sum of autocovariances of the error process and this quantity is being estimated using a slightly adjusted version of the Bühlmann (1996) algorithm. This procedure is similar to the method described in Feng, Gries and Fritz (2020), where data-driven local polynomial regression with an automatically selected bandwidth is used to estimate the trend according to a model without seasonality and where the same adjusted Bühlmann (1996) algorithm is considered to estimate the sum of autocovariances in the error process.
Define \(I[m^{(k)}] = \int_{c_b}^{d_b} [m^{(k)}(x)]^2 dx\), \(\beta_{(k)} = \int_{-1}^{1} u^k K(u) du\) and \(R(K) = \int_{-1}^{1} K^{2}(u) du\), where \(p\) is the order of the (local) polynomial considered for \(m\), \(k = p + 1\) is the order of the asymptotically equivalent kernel \(K\) for estimating \(m\), \(0 \leq c_{b}< d_{b} \leq 1\), and \(c_f\) is the variance factor, i.e. the sum of autocovariances divided by \(2\pi\).
Furthermore, we define $$C_{1} = \frac{I[m^{(k)}] \beta_{(k)}^2}{(k!)^2}$$ and $$C_{2} = \frac{2 \pi c_{f} (d_b - c_b)[R(K) + (p_s - 1) R(W)]}{nh}$$ with \(h\) being the bandwidth, \(n\) being the number of observations and \(W\) being the weighting function considered in the weighted least squares approach, for example a second-order kernel function with support on \([-1,1]\). The AMISE is then $$AMISE(h) = h^{2k}C_{1} + C_{2}.$$
The function calculates suitable estimates for \(c_f\), the variance factor, and \(I[m^{(k)}]\) over different iterations. In each iteration, a bandwidth is obtained in accordance with the AMISE that once more serves as an input for the following iteration. The process repeats until either convergence or the 40th iteration is reached. For further details on the asymptotic theory or the algorithm, please consult Feng, Gries and Fritz (2020) or Feng et al. (2019).
To apply the function, only few arguments are needed: a data input y,
an object with smoothing options smoothing_options returned by
set_options and
a starting value for the relative bandwidth
bwidth_start. Aside from y, each argument has a default setting.
By default, a local linear trend is considered. In some cases, a local cubic
trend may, however, be more suitable. For more specific information on the input arguments
consult the section Arguments.
When applying the function, an optimal bandwidth is obtained based on the IPI algorithm proposed by Feng, Gries and Fritz (2020). In a second step, the nonparametric trend of the series and the seasonality are calculated with respect to the chosen bandwidth.
Note that with this function \(m(x_t)\) and \(s(x_t)\) can be
estimated without a parametric
model assumption for the error series. Thus, after estimating and removing
the trend and the seasonality, any suitable parametric model, e.g. an
ARMA(\(p,q\)) model for errors = "autocor", can be fitted to the
residuals (see arima).
Usually, a local cubic trend (smoothing_options = set_options(order_poly = 3))
gives more suitable results. Moreover, if the resulting bandwidth is too large,
adjustments to the arguments inflation_rate and drop should be
tried first in that order before any other changes
to the input arguments.
The default print method for this function delivers key numbers such as the bandwidth considered for smoothing.
NOTE:
This function implements C++ code by means
of the Rcpp and
RcppArmadillo packages for
better performance.
Bühlmann, P. (1996). Locally Adaptive Lag-Window Spectral Estimation. Journal of Time Series Analysis, 17(3): 247-270. DOI: 10.1111/j.1467-9892.1996.tb00275.x.
Feng, Y. (2013). An iterative plug-in algorithm for decomposing seasonal time series using the Berlin Method. Journal of Applied Statistics, 40(2): 266-281. DOI: 10.1080/02664763.2012.740626.
Feng, Y., Gries. T, and Fritz, M. (2020). Data-driven local polynomial for the trend and its derivatives in economic time series. Journal of Nonparametric Statistics, 32(2): 510-533. DOI: 10.1080/10485252.2020.1759598.
# \donttest{
Xt <- log(EXPENDITURES)
smoothing_options <- set_options(order_poly = 3)
est <- deseats(Xt, smoothing_options = smoothing_options)
est
plot(est, which = 1)
# }
Run the code above in your browser using DataLab