Fit a fractionally integrated GARCH (FIGARCH) model under the six most common and further conditional distributions to observed data using quasi maximum-likelihood estimation.
figarch(
rt,
orders = c(1, 1),
cond_dist = c("norm", "std", "ged", "ald", "snorm", "sstd", "sged", "sald"),
drange = c(0, 1),
meanspec = mean_spec(),
Drange = c(0, 1),
nonparspec = locpol_spec(),
use_nonpar = FALSE,
n_test = 0,
start_pars = NULL,
LB = NULL,
UB = NULL,
control = list(),
control_nonpar = list(),
mean_after_nonpar = FALSE,
parallel = TRUE,
ncores = max(1, future::availableCores() - 1),
trunc = "none",
presample = 50,
Prange = c(1, 5)
)
An object of S4-class "fEGarch_fit_figarch"
is returned. It contains the following elements.
pars
:a named numeric vector with the parameter estimates.
se
:a named numeric vector with the standard errors of the parameter estimates.
vcov_mat
:the variance-covariance matrix of the parameter estimates with named columns and rows.
rt
:the input object rt
(or at least the training data, if n_test
is greater than zero);
if rt
was a "zoo"
object, the formatting is kept.
cmeans
:the estimated conditional means; if rt
was a "zoo"
or "ts"
object, the formatting is also applied to cmeans
.
sigt
:the estimated conditional standard deviations (or for use_nonpar = TRUE
the estimated total
volatilities, i.e. scale function value times conditional standard deviation); if rt
was a "zoo"
object, the formatting is also applied to sigt
.
etat
:the obtained residuals; if rt
was a "zoo"
object, the formatting is also applied to etat
.
orders
:a two-element numeric vector stating the considered model orders.
cond_dist
:a character value stating the conditional distribution considered in the model fitting.
long_memo
:a logical value stating whether or not long memory was considered in the model fitting.
llhood
:the log-likelihood value obtained at the optimal parameter combination.
inf_criteria
:a named two-element numeric vector with the corresponding AIC (first element) and BIC (second element) of the fitted parametric model part; for purely parametric models, these criteria are valid for the entire model; for semiparametric models, they are only valid for the parametric step and are not valid for the entire model.
meanspec
:the settings for the model in the conditional mean; is an object
of class "mean_spec"
that is identical to the object passed to the input argument
meanspec
.
test_obs
:the observations at the end of the input rt
reserved for
testing following n_test
.
scale_fun
:the estimated scale function values, if use_nonpar = TRUE
, otherwise
NULL
; formatting of rt
is reused.
nonpar_model
:the estimation object returned by tsmoothlm
for
use_nonpar = TRUE
.
trunc
:the input argument trunc
.
rt
:the observed series ordered from past to present; can be
a numeric vector or a "zoo"
class time series object.
orders
:a two-element numeric vector containing the two model
orders \(p\) and \(q\) (see Details for more information); currently,
only the default orders = c(1, 1)
is supported; other specifications
of a two-element numeric vector will lead to orders = c(1, 1)
being
run and a warning message being returned.
cond_dist
:the conditional distribution to consider as a
character object; the default is a conditional normal distribution
"norm"
; available are also, however, a \(t\)-distribution
("std"
), a generalized error distribution ("ged"
),
an average Laplace distribution ("ald"
),
and their four skewed variants ("snorm"
, "sstd"
,
"sged"
, "sald"
).
drange
:a two-element numeric vector that gives the boundaries of the
search interval for the fractional differencing parameter \(d\) in the conditional
volatility model part; is
overwritten by the settings of the arguments LB
and UB
.
meanspec
:an object of class "mean_spec"; indicates the specifications for the model in the conditional mean.
Drange
:a two-element numeric vector that indicates the boundaries
of the interval over which to search for the fractional differencing
parameter \(D\) in a long-memory ARMA-type model in the conditional mean
model part; by default,
\(D\) is searched for on the
interval from 0 to \(0.5 - 1\times 10^{-6}\); note that specific
settings in the arguments
LB
and UB
overwrite this argument.
nonparspec
:an object of class "locpol_spec"
returned
by locpol_spec
; defines the settings of the nonparametric
smoothing technique for use_nonpar = TRUE
.
use_nonpar
:a logical indicating whether or not to implement a
semiparametric extension of the specified volatility model;
see "Details" for more information.
n_test
:a single numerical value indicating how many observations
at the end of rt
are excluded from the fitting process and
reserved for backtesting.
start_pars
:the starting parameters for the numerical optimization
routine; should be of the same length as the parameter output vector
within the output object (also keeping the same order); for NULL
,
an internally saved default set of values is used; see "Details" for the
order of elements; elements should be set with respect to a series rescaled
to have sample variance one.
LB
:the lower boundaries of the parameters in the numerical optimization
routine; should be of the same length as the parameter output vector
within the output object (also keeping the same order); for NULL
,
an internally saved default set of values is used; see "Details" for the
order of elements; elements should be set with respect to a series rescaled
to have sample variance one.
UB
:the upper boundaries of the parameters in the numerical optimization
routine; should be of the same length as the parameter output vector
within the output object (also keeping the same order); for NULL
,
an internally saved default set of values is used; see "Details" for the
order of elements; elements should be set with respect to a series rescaled
to have sample variance one.
control
:a list that is passed to control
of the
function solnp
of the package Rsolnp
.
control_nonpar
:a list containing changes to the arguments
for the hyperparameter estimation algorithm in the nonparametric
scale function estimation for
use_nonpar = TRUE
; see "Details" for more information.
mean_after_nonpar
:only for use_nonpar = TRUE
; considers the unconditional mean
of the parametric model part in the QMLE step in a semiparametric model; by default, a zero-mean
model is considered for the parametric part in a semiparametric model.
parallel
:only relevant for a (skewed) average Laplace (AL)
distribution, i.e.
if cond_dist
is set to cond_dist = "ald"
or
cond_dist = "sald"
; parallel
is a logical value indicating whether
or not the slices for the positive integer-valued parameter of the AL
distribution should be fitted in parallel for a speed boost.
ncores
:only relevant for a (skewed) average Laplace (AL)
distribution, i.e.
if cond_dist
is set to cond_dist = "ald"
or
cond_dist = "sald"
, and if simultaneously parallel = TRUE
;
ncores
is a single numeric value indicating the number of cores to
use for parallel computations.
trunc
:a positive integer indicating the finite truncation length of the
infinite-order polynomials of the infinite-order representations of the
long-memory model parts; the character "none"
is an optional input
that specifies that truncation should always be applied back to the first (presample) observation
time point, i.e. that maximum length filters should be applied at all times.
presample
:the presample length for initialization (for extended EGARCH- / Log-GARCH-type models only relevant for the FARIMA-part, as series in log-transformed conditional variance are initialized by zero).
Prange
:a two-element vector that indicates the search boundaries for the parameter \(P\) in a (skewed) average Laplace distribution.
Let \(\left\{r_t\right\}\), with \(t \in \mathbb{Z}\) as the
time index, be a theoretical time series that follows
$$r_t=\mu+\varepsilon_t \text{ with } \varepsilon_t=\sigma_t \eta_t \text{ and } \eta_t \sim \text{IID}(0,1), \text{ where}$$
$$\sigma_t^{2}=\omega+\left[1-\beta^{-1}(B)\phi(B)(1-B)^{d}\right]\varepsilon_t^2.$$
Here, \(\eta_t\sim\text{IID}(0,1)\) means that the innovations
\(\eta_t\) are independent and identically distributed (iid) with mean zero
and variance one, whereas \(\sigma_t > 0\) are the conditional standard
deviations in \(r_t\).
Moreover, \(B\) is the backshift operator and
\(\beta(B) = 1 - \sum_{j=1}^{q}\beta_j B^{j}\), where
\(\beta_j\), \(j=1,2,\dots, q\), are real-valued coefficients. Furthermore,
\(\phi(B) = 1 - \sum_{i=1}^{p}\phi_i B^{i}\), where
\(\phi_i\), \(i=1,2,\dots, p\), are real-valued coefficients. \(p\)
and \(q\) are the model orders definable through the argument orders
,
where \(p\) is the first element and \(q\) is the second element in the
argument. In addition, we have \(\mu = E\left(r_t\right)\) as a
real-valued parameter and \(d \in [0,1]\) as the
parameter for the level of integration. With \(d = 0\) the model reduces
to a short-memory GARCH, for \(d=1\) we have a full integration, and for
\(d\in(0, 1)\), we have fractional integration, where \(d\in(0, 0.5)\) is usually
considered to describe a long-memory process. \(\omega > 0\) is the intercept. It is assumed that
all \(\beta_j\) and \(\phi_i\) are non-negative.
Currently, only a model of orders \(p=1\) and \(q=1\) can be fitted. Non-negativity of all coefficients of the infinite-order coefficient series \(\psi(B)\), in combination with \(\omega>0\), ensures that all the conditional volatilities are greater than zero. As an approximation to this requirement, we employ inequality constraints ensuring that the first 50 coefficients of the infinite-order ARCH-representation are non-negative. In theory, one may instead consider the sufficient conditions mentioned in Bollerslev and Mikkelsen (1996) or Tse (1998), which are however sometimes restrictive, or the simultaneously necessary and sufficient conditions by Conrad and Haag (2006), which are however complex to implement properly.
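The coefficients of the infinite-order ARCH-representation and the non-negativity check on the first 50 of them can be illustrated with a short base R sketch. The helper figarch_psi() is hypothetical and not part of the package; it simply expands \(1-\beta^{-1}(B)\phi(B)(1-B)^{d}\) for \(p=q=1\), and the parameter values are chosen for illustration only.

```r
# Hypothetical helper (not a package function): first n_coef coefficients
# psi_1, psi_2, ... of 1 - (1 - phi * B) * (1 - B)^d / (1 - beta * B)
figarch_psi <- function(phi, beta, d, n_coef = 50) {
  j <- seq_len(n_coef)
  # Coefficients pi_0, pi_1, ... of (1 - B)^d via pi_j = pi_{j-1} * (j - 1 - d) / j
  pi_j <- c(1, cumprod((j - 1 - d) / j))
  # Multiply by (1 - phi * B)
  num <- pi_j - phi * c(0, pi_j[-length(pi_j)])
  # Divide by (1 - beta * B): a_j = num_j + beta * a_{j-1}
  a <- as.numeric(stats::filter(num, beta, method = "recursive"))
  # Drop the lag-zero term and flip the sign
  -a[-1]
}

# Illustrative parameter values (assumptions, not estimates)
psi <- figarch_psi(phi = 0.2, beta = 0.5, d = 0.4)
all(psi >= 0)  # the approximate constraint on the first 50 coefficients
```

For \(d = 0\) the sketch returns \((\phi-\beta)\beta^{k-1}\), \(k = 1, 2, \dots\), i.e. the ARCH(\(\infty\)) weights of a GARCH(1,1) with ARCH coefficient \(\phi-\beta\), which is a convenient sanity check.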
The truncated infinite-order polynomial and the series of conditional variances are both computed following the idea by Nielsen and Noel (2021) for computational efficiency. To ensure stability of the first fitted in-sample conditional standard deviations, we however use a small presample of adjustable length (including length zero), which may introduce biases into the parameter estimators.
In the current package version, standard errors of parameter estimates are
computed from the Hessian at the optimum of the log-likelihood using
hessian
. To ensure numerical stability and
applicability to a huge variety of differently scaled data, parametric
models are first fitted to data that is scaled to have sample variance
1. Parameter estimates and other quantities are then either
retransformed or recalculated afterwards for the original data.
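As a rough illustration of Hessian-based standard errors, the following sketch fits a simple iid normal model by maximum likelihood and inverts the Hessian of the negative log-likelihood at the optimum. It is a toy stand-in, not the package's internal code: base R's optim() and optimHess() are used here as assumptions in place of the optimizer and the hessian routine mentioned above.

```r
set.seed(1)
x <- rnorm(500, mean = 0.1, sd = 2)

# Negative log-likelihood of an iid normal model; the standard deviation
# is parameterized on the log scale to keep it positive
negll <- function(par) {
  -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
}

fit <- optim(c(0, 0), negll, method = "BFGS")

# Numerical Hessian of the negative log-likelihood at the optimum
H <- optimHess(fit$par, negll)

# Inverse Hessian as variance-covariance estimate, standard errors from its diagonal
vcov_mat <- solve(H)
se <- sqrt(diag(vcov_mat))
names(se) <- c("mu", "log_sigma")
se
```

In this toy model the standard error of the mean estimate should be close to the estimated standard deviation divided by the square root of the sample size, which gives a quick plausibility check.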
For a conditional average Laplace distribution, an optimal model for each
distribution parameter \(P\) from 1 to 5 is estimated (assuming that
\(P\) is then fixed to the corresponding value). Afterwards, \(P\) is then
estimated by selecting the estimated model among the five fitted models that
has the largest log-likelihood. The five models are, by default, fitted
simultaneously using parallel programming techniques (see also the arguments
parallel
and ncores
, which are only relevant for a conditional
average Laplace distribution). After the optimal model (including
the estimate of \(P\) called \(\hat{P}\)) has been determined, \(P=\hat{P}\)
is seen as fixed to obtain the standard errors via the Hessian matrix for the
estimates of the continuous parameters. A standard error for \(\hat{P}\) is therefore
not obtained and the ones obtained for the remaining estimates do not account
for \(\hat{P}\).
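The slice-wise selection of \(P\) described above can be sketched as a small grid search. The example below is a toy stand-in under stated assumptions: a scaled Student t-distribution with integer degrees of freedom plays the role of the AL distribution, and fit_slice() is a hypothetical helper, not a package function.

```r
set.seed(42)
x <- stats::rt(1000, df = 3)

# Hypothetical slice fit: for a fixed integer P, maximize the log-likelihood
# over the remaining continuous parameter (here a log-scale parameter)
fit_slice <- function(P) {
  ll <- function(log_s) sum(dt(x / exp(log_s), df = P, log = TRUE) - log_s)
  opt <- optimize(ll, interval = c(-5, 5), maximum = TRUE)
  list(P = P, log_s = opt$maximum, llhood = opt$objective)
}

# One fitted model per integer value in the search range
Prange <- c(1, 5)
fits <- lapply(seq(Prange[1], Prange[2]), fit_slice)

# Select the slice with the largest log-likelihood
llhoods <- vapply(fits, function(f) f$llhood, numeric(1))
best <- fits[[which.max(llhoods)]]
best$P
```

Because the slices are independent of each other, the lapply() call is the natural place for a parallel map; standard errors for the continuous parameters would then be computed with the selected value treated as fixed.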
An ARMA-FIGARCH or a FARIMA-FIGARCH can be fitted by adjusting the
argument meanspec
correspondingly.
As an alternative, a semiparametric extension of the pure models
in the conditional variance can be implemented. If use_nonpar = TRUE
,
meanspec
is omitted and before fitting a zero-mean model in the
conditional volatility following the remaining function arguments, a smooth scale function,
i.e. a function representing the unconditional standard deviation over time,
is being estimated following the specifications in nonparspec
and
control_nonpar
. This preliminary step stabilizes the input
series rt
, as long-term changes in the unconditional variance
are being estimated and removed before the parametric step using
tsmoothlm
. control_nonpar
can be adjusted
to make changes to the arguments of tsmoothlm
for long-memory specifications. These arguments specify settings
for the automated bandwidth selection algorithms implemented by this
function. By default, we use the settings
pmin = 0
, pmax = 1
, qmin = 0
,
qmax = 1
, InfR = "Nai"
,
bStart = 0.15
, cb = 0.05
, and
method = "lpr"
for tsmoothlm
.
locpol_spec
passed to nonparspec
handles
more direct settings of the local polynomial smoother itself. See
the documentation for these functions to get a detailed overview
of these settings. Assume \(\{r_t\}\) to be the observed series, where
\(t = 1, 2, \dots, n\),
then \(r_t^{*} = r_t - \bar{r}\), with \(\bar{r}\) being the arithmetic
mean over the observed \(r_t\), is computed and subsequently
\(y_t = \ln\left[\left(r_t^{*}\right)^2\right]\). The subtraction of
\(\bar{r}\) is necessary so that \(r_t^{*}\) are all different from zero
almost surely. Once \(y_t\) are available, its trend \(m(x_t)\),
with \(x_t\) as the rescaled time on the interval \([0, 1]\), is
being estimated using
tsmoothlm
and denoted here by
\(\hat{m}(x_t)\). Then from \(\hat{\xi}_t = y_t - \hat{m}(x_t)\)
obtain \(\hat{C} = -\ln\left\{\sum_{t=1}^{n}\exp\left(\hat{\xi}_t\right)\right\}\),
and obtain the estimated scale function as
\(\hat{s}(x_t)=\exp\left[\left(\hat{m}(x_t) - \hat{C}\right) / 2\right]\).
The stabilized / standardized version of the series \(\left\{r_t\right\}\)
is then \(\tilde{r}_t = r_t^{*} / \hat{s}(x_t)\), to which
a purely parametric volatility model following the remaining function arguments
is then
fitted. The estimated volatility at a given time point is then
the product of the estimate of the corresponding scale function value
and of the estimated conditional standard deviation (following the parametric
model part) for that same time point. See for example Feng et al. (2022)
or Letmathe et al. (2023) for more information on the semiparametric extension
of volatility models.
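The steps of the scale function estimation can be traced in a few lines of base R. This is a minimal sketch under assumptions: the data are simulated, and loess() with an arbitrary span serves as a stand-in smoother for tsmoothlm, so no automated bandwidth selection or long-memory adjustment takes place.

```r
set.seed(123)
n <- 1000
x_t <- seq_len(n) / n                      # rescaled time on (0, 1]
s_true <- 1 + 0.5 * sin(2 * pi * x_t)      # illustrative smooth scale function
r <- s_true * rnorm(n)

# Step 1: center the series and take logs of the squared values
r_star <- r - mean(r)
y <- log(r_star^2)

# Step 2: estimate the trend in y (loess is a stand-in for tsmoothlm)
m_hat <- as.numeric(fitted(loess(y ~ x_t, span = 0.3, degree = 1)))

# Step 3: residuals, constant C and scale function as written in the text
xi_hat <- y - m_hat
C_hat <- -log(sum(exp(xi_hat)))
s_hat <- exp((m_hat - C_hat) / 2)

# Step 4: standardized series passed on to the parametric step
r_tilde <- r_star / s_hat
```

Because \(\hat{C}\) is computed from the sum as written above, the squared values of r_tilde add up to one in this sketch; the total volatility at each time point would be s_hat times the conditional standard deviation from the parametric fit.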
The order for manual settings of start_pars
, LB
and UB
is crucial. The correct order is: \(\mu\), \(\text{ar}_1,\dots,\text{ar}_{p^{*}}\),
\(\text{ma}_1,\dots,\text{ma}_{q^{*}}\), \(D\), \(\omega\),
\(\phi\), \(\beta\), \(d\), shape parameter,
skewness parameter. Depending on the exact model specification,
parameters irrelevant for the specification at hand should be dropped
in start_pars
, LB
and UB
.
Baillie, R., Bollerslev, T., & Mikkelsen, H. O. (1996). Fractionally integrated generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 74(1), 3-30. DOI: 10.1016/S0304-4076(95)01749-6.
Bollerslev, T., & Mikkelsen, H. O. (1996). Modeling and pricing long memory in stock market volatility. Journal of Econometrics, 73(1), 151-184. DOI: 10.1016/0304-4076(95)01736-4.
Conrad, C., & Haag, B. R. (2006). Inequality constraints in the fractionally integrated GARCH model. Journal of Financial Econometrics, 4(3), 413-449. DOI: 10.1093/jjfinec/nbj015.
Conrad, C., & Karanasos, M. (2006). The impulse response function of the long memory GARCH process. Economics Letters, 90(1), 34-41. DOI: 10.1016/j.econlet.2005.07.001.
Feng, Y., Gries, T., Letmathe, S., & Schulz, D. (2022). The smoots Package in R for Semiparametric Modeling of Trend Stationary Time Series. The R Journal, 14(1), 182-195. URL: https://journal.r-project.org/articles/RJ-2022-017/.
Karanasos, M., Psaradakis, Z., & Sola, M. (2004). On the autocorrelation properties of long-memory GARCH processes. Journal of Time Series Analysis, 25(2), 265-281. DOI: 10.1046/j.0143-9782.2003.00349.x.
Letmathe, S., Beran, J., & Feng, Y. (2023). An extended exponential SEMIFAR model with application in R. Communications in Statistics - Theory and Methods, 53(22), 7914–7926. DOI: 10.1080/03610926.2023.2276049.
Nielsen, M. O., & Noel, A. L. (2021). To infinity and beyond: Efficient computation of ARCH(\(\infty\)) models. Journal of Time Series Analysis, 42(3), 338–354. DOI: 10.1111/jtsa.12570.
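library(fEGarch)
# Fetch zoo's window method directly from the package namespace
window.zoo <- get("window.zoo", envir = asNamespace("zoo"))
# Restrict the SP500 series to observations up to the end of 2002
rt <- window.zoo(SP500, end = "2002-12-31")
# Fit a FIGARCH(1, d, 1) with the default settings and print the result
model <- figarch(rt)
model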