garch: GARCH Model Fitting

Description

Fit a standard GARCH model to observed data using quasi-maximum-likelihood estimation under one of eight conditional distributions (see the argument cond_dist).

Usage

garch(
  rt,
  orders = c(1, 1),
  cond_dist = c("norm", "std", "ged", "ald", "snorm", "sstd", "sged", "sald"),
  meanspec = mean_spec(),
  Drange = c(0, 1),
  nonparspec = locpol_spec(),
  use_nonpar = FALSE,
  n_test = 0,
  start_pars = NULL,
  LB = NULL,
  UB = NULL,
  control = list(),
  control_nonpar = list(),
  mean_after_nonpar = FALSE,
  parallel = TRUE,
  ncores = max(1, future::availableCores() - 1),
  trunc = "none",
  presample = 50,
  Prange = c(1, 5)
)

Value

An object of S4-class "fEGarch_fit_garch" is returned. It contains the following elements.

pars:: a named numeric vector with the parameter estimates.
se:: a named numeric vector with the obtained standard errors in accordance with the parameter estimates.
vcov_mat:: the variance-covariance matrix of the parameter estimates with named columns and rows.
rt:: the input object rt (or at least the training data, if n_test is greater than zero); if rt was a "zoo" or "ts" object, the formatting is kept.
cmeans:: the estimated conditional means; if rt was a "zoo" or "ts" object, the formatting is also applied to cmeans.
sigt:: the estimated conditional standard deviations (or for use_nonpar = TRUE the estimated total volatilities, i.e. scale function value times conditional standard deviation); if rt was a "zoo" or "ts" object, the formatting is also applied to sigt.
etat:: the obtained residuals; if rt was a "zoo" or "ts" object, the formatting is also applied to etat.
orders:: a two-element numeric vector stating the considered model orders.
cond_dist:: a character value stating the conditional distribution considered in the model fitting.
long_memo:: a logical value stating whether or not long memory was considered in the model fitting.
llhood:: the log-likelihood value obtained at the optimal parameter combination.
inf_criteria:: a named two-element numeric vector with the corresponding AIC (first element) and BIC (second element) of the fitted parametric model part; for purely parametric models, these criteria are valid for the entire model; for semiparametric models, they are valid only for the parametric step, not for the entire model.
meanspec:: the settings for the model in the conditional mean; is an object of class "mean_spec" that is identical to the object passed to the input argument meanspec.
test_obs:: the observations at the end of the input rt reserved for testing following n_test.
scale_fun:: the estimated scale function values, if use_nonpar = TRUE, otherwise NULL; formatting of rt is reused.
nonpar_model:: the estimation object returned by tsmooth for use_nonpar = TRUE.
trunc:: the input argument trunc.

Arguments

rt: the observed series ordered from past to present; can be a numeric vector, a "zoo" class time series object, or a "ts" class time series object.
orders: a two-element numeric vector containing the two model orders $p$ and $q$ (see Details for more information); currently, only the default orders = c(1, 1) is supported; other specifications of a two-element numeric vector will lead to orders = c(1, 1) being run and a warning message being returned.
cond_dist: the conditional distribution to consider as a character object; the default is a conditional normal distribution ("norm"); also available are a $t$-distribution ("std"), a generalized error distribution ("ged"), an average Laplace distribution ("ald"), and their four skewed variants ("snorm", "sstd", "sged", "sald").
meanspec: an object of class "mean_spec"; indicates the specifications for the model in the conditional mean.
Drange: a two-element numeric vector that indicates the boundaries of the interval over which to search for the fractional differencing parameter $D$ in a long-memory ARMA-type model in the conditional mean model part; by default, $D$ is searched for on the interval from 0 to $0.5 - 1\times 10^{-6}$; note that specific settings in the arguments LB and UB overwrite this argument.
nonparspec: an object of class "locpol_spec" returned by locpol_spec; defines the settings of the nonparametric smoothing technique for use_nonpar = TRUE.
use_nonpar: a logical indicating whether or not to implement a semiparametric extension of the volatility model; see "Details" for more information.
n_test: a single numerical value indicating how many observations at the end of rt to exclude from the fitting process and to reserve for backtesting.
start_pars: the starting parameters for the numerical optimization routine; should be of the same length as the parameter output vector within the output object (also keeping the same order); for NULL, an internally saved default set of values is used; see "Details" for the order of elements; elements should be set with respect to a series rescaled to have sample variance one.
LB: the lower boundaries of the parameters in the numerical optimization routine; should be of the same length as the parameter output vector within the output object (also keeping the same order); for NULL, an internally saved default set of values is used; see "Details" for the order of elements; elements should be set with respect to a series rescaled to have sample variance one.
UB: the upper boundaries of the parameters in the numerical optimization routine; should be of the same length as the parameter output vector within the output object (also keeping the same order); for NULL, an internally saved default set of values is used; see "Details" for the order of elements; elements should be set with respect to a series rescaled to have sample variance one.
control: a list that is passed to control of the function solnp of the package Rsolnp.
control_nonpar: a list containing changes to the arguments for the hyperparameter estimation algorithm in the nonparametric scale function estimation for use_nonpar = TRUE; see "Details" for more information.
mean_after_nonpar: only for use_nonpar = TRUE; considers the unconditional mean of the parametric model part in the QMLE step in a semiparametric model; by default, a zero-mean model is considered for the parametric part in a semiparametric model.
parallel: only relevant for a (skewed) average Laplace (AL) distribution, i.e. if cond_dist is set to "ald" or "sald"; parallel is a logical value indicating whether or not the slices for the positive integer-valued parameter $P$ of the AL distribution should be fitted in parallel for a speed boost.
ncores: only relevant for a (skewed) average Laplace (AL) distribution, i.e. if cond_dist is set to "ald" or "sald", and if simultaneously parallel = TRUE; ncores is a single numeric value indicating the number of cores to use for parallel computations.
trunc: a positive integer indicating the finite truncation length of the infinite-order polynomials of the infinite-order representations of the long-memory model parts; the character "none" is an optional input that specifies that truncation should always be applied back to the first (presample) observation time point, i.e. that maximum length filters should be applied at all times.
presample: the presample length for initialization (for extended EGARCH- / Log-GARCH-type models only relevant for the FARIMA-part, as series in log-transformed conditional variance are initialized by zero).
Prange: a two-element vector that indicates the search boundaries for the parameter $P$ in a (skewed) average Laplace distribution.

Details

Let $\left\{r_t\right\}$, with $t \in \mathbb{Z}$ as the time index, be a theoretical time series that follows $$r_t=\mu+\varepsilon_t \text{ with } \varepsilon_t=\sigma_t \eta_t \text{ and } \eta_t \sim \text{IID}(0,1), \text{ where}$$ $$\sigma_t^{2}=\omega+\sum_{i=1}^{p}\phi_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q}\beta_j \sigma_{t-j}^2.$$ Here, $\eta_t\sim\text{IID}(0,1)$ means that the innovations $\eta_t$ are independent and identically distributed (iid) with mean zero and variance one, whereas $\sigma_t > 0$ are the conditional standard deviations in $r_t$. $\phi_i$, $i=1,2,\dots, p$, and $\beta_j$, $j = 1, \dots, q$, are non-negative coefficients. $p$ and $q$ are the model orders definable through the argument orders, where $p$ is the first element and $q$ is the second element in the argument. In addition, we have $\mu = E\left(r_t\right)$ as a real-valued parameter. $\omega > 0$ is the intercept. This overall definition is in accordance with Bollerslev (1986).
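
To make the recursion above concrete, the following base R sketch simulates a GARCH(1, 1) series directly from the two model equations. It is only an illustration: all parameter values and object names (rt_sim, sig2, eps) are assumptions chosen for this example and are not part of the package, and normal innovations are used for simplicity.

```r
# Simulate n observations from a GARCH(1, 1) with normal innovations.
# All parameter values below are illustrative assumptions.
set.seed(123)
n <- 1000
mu <- 0.05; omega <- 0.02; phi <- 0.1; beta <- 0.85

eta <- rnorm(n)       # iid innovations with mean 0 and variance 1
sig2 <- numeric(n)    # conditional variances sigma_t^2
eps <- numeric(n)     # eps_t = sigma_t * eta_t

# Start the recursion at the unconditional variance omega / (1 - phi - beta),
# which exists here because phi + beta < 1
sig2[1] <- omega / (1 - phi - beta)
eps[1] <- sqrt(sig2[1]) * eta[1]

for (t in 2:n) {
  sig2[t] <- omega + phi * eps[t - 1]^2 + beta * sig2[t - 1]
  eps[t] <- sqrt(sig2[t]) * eta[t]
}

rt_sim <- mu + eps    # simulated series r_t = mu + eps_t
```

A series generated this way could then be passed to garch() as the argument rt to check how closely the estimates recover the chosen values.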

In the current package version, standard errors of parameter estimates are computed from the Hessian at the optimum of the log-likelihood using hessian. To ensure numerical stability and applicability to a huge variety of differently scaled data, parametric models are first fitted to data that is scaled to have sample variance 1. Parameter estimates and other quantities are then either retransformed or recalculated afterwards for the original data.
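
The effect of the rescaling on the GARCH parameters can be checked with a small base R sketch. It follows from the model equations that, if the series is divided by its sample standard deviation $s$, the intercept of the rescaled model is $\omega / s^2$ while $\phi_i$ and $\beta_j$ are unchanged. The helper garch_sig2 and all numerical values are hypothetical and only serve this illustration; how the package retransforms its estimates internally is not shown here.

```r
# Conditional variance recursion of a GARCH(1, 1) for given residuals
garch_sig2 <- function(eps, omega, phi, beta, sig2_1) {
  n <- length(eps)
  sig2 <- numeric(n)
  sig2[1] <- sig2_1
  for (t in 2:n) {
    sig2[t] <- omega + phi * eps[t - 1]^2 + beta * sig2[t - 1]
  }
  sig2
}

set.seed(1)
eps <- rnorm(500, sd = 3)   # stand-in for demeaned observations
s <- sd(eps)                # sample standard deviation
eps_scaled <- eps / s       # rescaled to sample variance one

# Hypothetical parameter values on the rescaled data
omega_s <- 0.05; phi <- 0.1; beta <- 0.85

sig2_scaled <- garch_sig2(eps_scaled, omega_s, phi, beta, sig2_1 = 1)

# On the original scale: omega is multiplied by s^2, phi and beta stay the same
sig2_orig <- garch_sig2(eps, omega_s * s^2, phi, beta, sig2_1 = s^2)

# Both routes give the same conditional variances on the original scale
all.equal(sig2_orig, sig2_scaled * s^2)
```

This is also why start_pars, LB and UB should be chosen with respect to a series of sample variance one: only scale-dependent parameters such as $\mu$ and $\omega$ differ between the two scales.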

For a conditional average Laplace distribution, an optimal model for each distribution parameter $P$ from 1 to 5 is estimated (assuming that $P$ is fixed to the corresponding value). $P$ is then estimated by selecting, among the five fitted models, the one with the largest log-likelihood. The five models are, by default, fitted simultaneously using parallel programming techniques (see also the arguments parallel and ncores, which are only relevant for a conditional average Laplace distribution). After the optimal model (including the estimate of $P$ called $\hat{P}$) has been determined, $P=\hat{P}$ is treated as fixed to obtain the standard errors via the Hessian matrix for the estimates of the continuous parameters. A standard error for $\hat{P}$ is therefore not obtained, and the standard errors obtained for the remaining estimates do not account for the estimation of $P$.
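
The selection of an integer-valued parameter over a grid can be sketched in base R as follows. Note that this is not the average Laplace likelihood of the package: as a stand-in, a scaled $t$-distribution with integer degrees of freedom is used, with the degrees of freedom playing the role of $P$. The function loglik and all object names are assumptions made for this illustration, and the fits are run sequentially rather than in parallel.

```r
set.seed(42)
# Toy data from a scaled t-distribution (stats::rt is the random number
# generator here, not the input series rt of garch())
x <- 2 * stats::rt(2000, df = 3)

# Log-likelihood in the continuous parameter (log scale) for fixed integer P
loglik <- function(log_scale, P, x) {
  s <- exp(log_scale)
  sum(dt(x / s, df = P, log = TRUE) - log_scale)
}

# Step 1: for each P on the grid, optimize over the continuous parameter
P_grid <- 1:5
fits <- lapply(P_grid, function(P) {
  opt <- optimize(loglik, interval = c(-5, 5), P = P, x = x, maximum = TRUE)
  list(P = P, log_scale = opt$maximum, llhood = opt$objective)
})

# Step 2: pick the fitted model with the largest log-likelihood
llhoods <- vapply(fits, function(f) f$llhood, numeric(1))
best <- fits[[which.max(llhoods)]]
best$P
```

Standard errors computed afterwards from the Hessian in the continuous parameters would treat best$P as known, which mirrors the limitation described above.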

An ARMA-GARCH or a FARIMA-GARCH can be fitted by adjusting the argument meanspec correspondingly.

As an alternative, a semiparametric extension of the pure models in the conditional variance can be implemented. If use_nonpar = TRUE, meanspec is ignored and, before fitting a model in the conditional volatility following the remaining function arguments, a smooth scale function, i.e. a function representing the unconditional standard deviation over time, is estimated following the specifications in nonparspec and control_nonpar. This preliminary step stabilizes the input series rt, as long-term changes in the unconditional variance are estimated using tsmooth and removed before the parametric step. control_nonpar can be adjusted to make changes to the arguments of tsmooth for short-memory specifications. These arguments specify settings for the automated bandwidth selection algorithms implemented by this function. By default, we use the settings InfR = "Nai", bStart = 0.15, cb = 0.05, and method = "lpr" for tsmooth. locpol_spec passed to nonparspec handles more direct settings of the local polynomial smoother itself. See the documentation for these functions to get a detailed overview of these settings. Assume $\{r_t\}$ to be the observed series, where $t = 1, 2, \dots, n$, then $r_t^{*} = r_t - \bar{r}$, with $\bar{r}$ being the arithmetic mean over the observed $r_t$, is computed and subsequently $y_t = \ln\left[\left(r_t^{*}\right)^2\right]$. The subtraction of $\bar{r}$ is necessary so that $r_t^{*}$ are all different from zero almost surely. Once $y_t$ are available, their trend $m(x_t)$, with $x_t$ as the rescaled time on the interval $[0, 1]$, is estimated using tsmooth and denoted here by $\hat{m}(x_t)$. Then from $\hat{\xi}_t = y_t - \hat{m}(x_t)$ obtain $\hat{C} = -\ln\left\{\sum_{t=1}^{n}\exp\left(\hat{\xi}_t\right)\right\}$, and obtain the estimated scale function as $\hat{s}(x_t)=\exp\left[\left(\hat{m}(x_t) - \hat{C}\right) / 2\right]$.
The stabilized / standardized version of the series $\left\{r_t\right\}$ is then $\tilde{r}_t = r_t^{*} / \hat{s}(x_t)$, to which a purely parametric volatility model following the remaining function arguments is then fitted. The estimated volatility at a given time point is then the product of the estimate of the corresponding scale function value and of the estimated conditional standard deviation (following the parametric model part) for that same time point. See for example Feng et al. (2022) or Letmathe et al. (2023) for more information on the semiparametric extension of volatility models.
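
The sequence of transformations in the semiparametric step can be traced with the following base R sketch. It is not the package's implementation: stats::lowess with an arbitrarily chosen span is used as a stand-in smoother instead of tsmooth, the rescaled time is assumed to be $x_t = t / n$, and the toy series with its scale function s_true is hypothetical. The constant $\hat{C}$ is computed exactly as written in the text.

```r
set.seed(7)
n <- 2000
x <- seq_len(n) / n                     # rescaled time; t / n is an assumption
s_true <- 1 + 0.5 * sin(2 * pi * x)     # hypothetical smooth scale function
r <- 0.1 + s_true * rnorm(n)            # toy series with slowly changing scale

r_star <- r - mean(r)                   # r_t^* = r_t - r_bar
y <- log(r_star^2)                      # y_t = ln[(r_t^*)^2]

# Stand-in smoother (NOT tsmooth): lowess with a hypothetical span f
m_hat <- lowess(x, y, f = 0.3)$y        # estimated trend of y_t

xi_hat <- y - m_hat                     # xi_t = y_t - m_hat(x_t)
C_hat <- -log(sum(exp(xi_hat)))         # constant C as written in the text
s_hat <- exp((m_hat - C_hat) / 2)       # estimated scale function
r_tilde <- r_star / s_hat               # stabilized / standardized series
```

By construction, $\tilde{r}_t^2 = \exp(\hat{\xi}_t + \hat{C})$, so with $\hat{C}$ computed from the plain sum the squared values of r_tilde add up to one. A purely parametric volatility model would then be fitted to r_tilde, and the total volatility would be the product of s_hat and the fitted conditional standard deviations.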

The order for manual settings of start_pars, LB and UB is crucial. The correct order is: $\mu$, $\text{ar}_1,\dots,\text{ar}_{p^{*}}$, $\text{ma}_1,\dots,\text{ma}_{q^{*}}$, $D$, $\omega$, $\phi_1,\dots,\phi_p$, $\beta_1, \dots, \beta_{q}$, shape parameter, skewness parameter. Depending on the exact model specification, parameters irrelevant for the specification at hand should be dropped in start_pars, LB and UB.
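
As a small aid for getting this order right, the following base R sketch builds the sequence of parameter labels for a given specification, dropping the entries that are not part of the model. The helper par_order, its arguments and the returned labels are hypothetical and chosen for this illustration only; they are not necessarily the names used in the package output. The sketch also assumes that $\mu$ is always part of the model.

```r
# Returns parameter labels in the order required for start_pars, LB and UB.
# p_star / q_star: AR and MA orders in the conditional mean;
# long_memo: whether D is part of the mean model;
# shape / skew: whether the chosen distribution has these parameters.
par_order <- function(p_star = 0, q_star = 0, long_memo = FALSE,
                      p = 1, q = 1, shape = FALSE, skew = FALSE) {
  c(
    "mu",
    if (p_star > 0) paste0("ar", seq_len(p_star)),
    if (q_star > 0) paste0("ma", seq_len(q_star)),
    if (long_memo) "D",
    "omega",
    paste0("phi", seq_len(p)),
    paste0("beta", seq_len(q)),
    if (shape) "shape",
    if (skew) "skew"
  )
}

# AR(1) in the mean, GARCH(1, 1), distribution with a shape parameter only
par_order(p_star = 1, shape = TRUE)
```

A vector of starting values or bounds can then be filled in position by position against these labels before being passed to start_pars, LB or UB.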

References

Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3): 307-327. DOI: 10.1016/0304-4076(86)90063-1.
Feng, Y., Gries, T., Letmathe, S., & Schulz, D. (2022). The smoots Package in R for Semiparametric Modeling of Trend Stationary Time Series. The R Journal, 14(1), 182-195. URL: https://journal.r-project.org/articles/RJ-2022-017/.
Letmathe, S., Beran, J., & Feng, Y. (2023). An extended exponential SEMIFAR model with application in R. Communications in Statistics - Theory and Methods, 53(22), 7914–7926. DOI: 10.1080/03610926.2023.2276049.

Examples

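library(fEGarch)  # provides garch(); SP500 is assumed to ship with the package
# Access the "zoo" method of window() without attaching the zoo package
window.zoo <- get("window.zoo", envir = asNamespace("zoo"))
# Keep the observations up to the end of 2002
rt <- window.zoo(SP500, end = "2002-12-31")
# Fit a GARCH(1, 1) with the default conditional normal distribution
model <- garch(rt)
model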
