fitGMAR: Estimate Gaussian or Student's t Mixture Autoregressive model

Description

fitGMAR estimates a GMAR or StMAR model in two phases. It first uses a genetic algorithm to find parameter values close to the maximum point of the log-likelihood function, and then uses them as starting values for a quasi-Newton method that finds the maximum point.

Usage

fitGMAR(data, p, M, StMAR = FALSE, restricted = FALSE,
  constraints = FALSE, R, conditional = TRUE, nCalls, multicore = TRUE,
  ncores, initpop = FALSE, ngen, popsize, smartMu, ar0scale, sigmascale,
  printRes = TRUE, runTests = FALSE)

Arguments

data

a numeric vector or column matrix containing the data. NA values are not supported.

p

a positive integer specifying the order of AR coefficients.

M

a positive integer specifying the number of mixture components or regimes.

StMAR

an (optional) logical argument stating whether a StMAR model should be considered instead of a GMAR model. Default is FALSE.

restricted

an (optional) logical argument stating whether the AR coefficients $\phi_{m,1},...,\phi_{m,p}$ are restricted to be the same for all regimes. Default is FALSE.

constraints

an (optional) logical argument stating whether general linear constraints should be applied to the model. Default is FALSE.

R

Specifies the linear constraints.

For non-restricted models:: a list of size $(p \times q_{m})$ constraint matrices $R_{m}$ of full column rank satisfying $\phi_{m}=R_{m}\psi_{m}$ for all $m=1,...,M$, where $\phi_{m}=(\phi_{m,1},...,\phi_{m,p})$ and $\psi_{m}=(\psi_{m,1},...,\psi_{m,q_{m}})$.
For restricted models:: a size $(p \times q)$ constraint matrix $R$ of full column rank satisfying $\phi=R\psi$, where $\phi=(\phi_{1},...,\phi_{p})$ and $\psi=(\psi_{1},...,\psi_{q})$.

Symbol $\phi$ denotes an AR coefficient. Note that regardless of any constraints, the nominal order of AR coefficients is always p for all regimes. This argument is ignored if constraints==FALSE.
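
The relation $\phi_{m}=R_{m}\psi_{m}$ can be checked by hand before passing R to fitGMAR. The following is a minimal base R sketch, assuming p = 3 with the second AR coefficient constrained to zero; the values in psi_m are made up for illustration and are not estimates.

```r
# Hypothetical constraint for one regime: p = 3, second AR coefficient fixed to zero.
# matrix() fills column by column, so the columns are (1, 0, 0) and (0, 0, 1).
R_m <- matrix(c(1, 0, 0, 0, 0, 1), ncol = 2)

# Free parameters psi_m (made-up values, q_m = 2 of them).
psi_m <- c(0.5, -0.2)

# Implied AR coefficients phi_m = R_m %*% psi_m.
phi_m <- as.vector(R_m %*% psi_m)
phi_m

# The constraint matrix must have full column rank.
full_rank <- qr(R_m)$rank == ncol(R_m)
full_rank
```

Here the second element of phi_m is zero by construction, while the first and third elements equal the two free parameters.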

conditional

an (optional) logical argument specifying whether the conditional or exact log-likelihood function should be used. Default is TRUE.

nCalls

an (optional) positive integer specifying how many rounds of estimation should be done. The estimation results may vary from round to round because of multimodality of the log-likelihood function and randomness associated with the genetic algorithm. Default is round(10 + 9*log(M)).
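
As a quick illustration of how the default number of estimation rounds grows with the number of regimes, the formula above can be evaluated directly. The helper name default_nCalls is hypothetical and not part of the package.

```r
# Hypothetical helper reproducing the documented default round(10 + 9*log(M)).
default_nCalls <- function(M) round(10 + 9 * log(M))

# Default number of estimation rounds for M = 1, ..., 4 regimes.
n_rounds <- sapply(1:4, default_nCalls)
n_rounds
```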

multicore

an (optional) logical argument defining whether parallel computing should be used in the estimation process. Parallel computing is highly recommended; default is TRUE.

ncores

an (optional) positive integer specifying the number of cores to be used in the estimation process. By default, the number of available cores is detected with parallel::detectCores() and all of them are used. Ignored if multicore==FALSE.

initpop

an (optional) list of parameter vectors from which the initial population of the genetic algorithm will be generated. The parameter vectors should be of the form...

For non-restricted models:

For GMAR model:: Size $((M(p+3)-1) \times 1)$ vector $\theta=(\upsilon_{1},...,\upsilon_{M},\alpha_{1},...,\alpha_{M-1})$, where $\upsilon_{m}=(\phi_{m,0},\phi_{m},\sigma_{m}^2)$ and $\phi_{m}=(\phi_{m,1},...,\phi_{m,p}), m=1,...,M$.
For StMAR model:: Size $((M(p+4)-1) \times 1)$ vector $(\theta,\nu)=(\upsilon_{1},...,\upsilon_{M},\alpha_{1},...,\alpha_{M-1},\nu_{1},...,\nu_{M})$.
With linear constraints:: Replace the vectors $\phi_{m}$ with vectors $\psi_{m}$ and provide a list of constraint matrices $R_{m}$ that satisfy $\phi_{m}=R_{m}\psi_{m}$ for all $m=1,...,M$, where $\psi_{m}=(\psi_{m,1},...,\psi_{m,q_{m}})$.

For restricted models:

For GMAR model:: Size $((3M+p-1) \times 1)$ vector $\theta=(\phi_{1,0},...,\phi_{M,0},\phi,\sigma_{1}^2,...,\sigma_{M}^2,\alpha_{1},...,\alpha_{M-1})$, where $\phi=(\phi_{1},...,\phi_{p})$.
For StMAR model:: Size $((4M+p-1) \times 1)$ vector $(\theta,\nu)=(\phi_{1,0},...,\phi_{M,0},\phi,\sigma_{1}^2,...,\sigma_{M}^2,\alpha_{1},...,\alpha_{M-1},\nu_{1},...,\nu_{M})$.
With linear constraints:: Replace the vector $\phi$ with vector $\psi$ and provide a constraint matrix $R$ that satisfies $\phi=R\psi$, where $\psi=(\psi_{1},...,\psi_{q})$.

Symbol $\phi$ denotes an AR coefficient, $\sigma^2$ a variance, $\alpha$ a mixing weight and $\nu$ a degrees of freedom parameter. Note that in the case M=1 the parameter $\alpha$ is dropped, and in the case of a StMAR model the degrees of freedom parameters $\nu_{m}$ have to be larger than $2$. If not specified (or FALSE, as is the default), the initial population will be drawn randomly.
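
The parameter vector layouts described above can be assembled by hand. The sketch below assumes p = 1 and M = 2 and uses made-up numbers; the object names (upsilon, theta_gmar and so on) are illustrative only, and the lengths are checked against the sizes stated above.

```r
p <- 1
M <- 2

# Non-restricted GMAR: regime blocks upsilon_m = (phi_{m,0}, phi_{m,1}, sigma_m^2),
# followed by the mixing weights alpha_1, ..., alpha_{M-1}. Values are made up.
upsilon <- list(c(1.0, 0.8, 1.5), c(2.0, 0.3, 4.0))
alpha <- 0.6
theta_gmar <- c(unlist(upsilon), alpha)

# Non-restricted StMAR: append degrees of freedom nu_1, ..., nu_M (each > 2).
nu <- c(5, 10)
theta_stmar <- c(theta_gmar, nu)

# Restricted GMAR: (phi_{1,0}, phi_{2,0}, phi_1, sigma_1^2, sigma_2^2, alpha_1).
theta_gmar_r <- c(1.0, 2.0, 0.5, 1.5, 4.0, 0.6)

# Check the lengths against the documented sizes.
c(length(theta_gmar)   == M * (p + 3) - 1,
  length(theta_stmar)  == M * (p + 4) - 1,
  length(theta_gmar_r) == 3 * M + p - 1)

# A list like this is the shape that initpop is documented to accept.
initpop <- list(theta_gmar)
```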

ngen

an (optional) positive integer specifying the number of generations to be run through in the genetic algorithm. Default is min(400, max(round(0.1*length(data)), 200)).

popsize

an (optional) positive even integer specifying the population size in the genetic algorithm. Default is 10*d where d is the number of parameters.

smartMu

an (optional) positive integer specifying the generation after which the random mutations in the genetic algorithm are "smart". This means that mutating individuals will mostly mutate fairly close to the best fitting individual so far. Default is min(100, round(0.5*ngen)).
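
To see what the documented defaults of ngen, popsize and smartMu amount to in practice, they can be evaluated for a given series length and parameter count. This is a sketch only: the helper ga_defaults and its arguments n_obs and d are hypothetical names, not part of the package.

```r
# Hypothetical helper evaluating the documented defaults.
# n_obs: length of the data, d: number of parameters in the model.
ga_defaults <- function(n_obs, d) {
  ngen <- min(400, max(round(0.1 * n_obs), 200))
  popsize <- 10 * d
  smartMu <- min(100, round(0.5 * ngen))
  c(ngen = ngen, popsize = popsize, smartMu = smartMu)
}

# A GMAR model with p = 1, M = 2 has d = M*(p+3) - 1 = 7 parameters.
short_series <- ga_defaults(n_obs = 1500, d = 7)
long_series  <- ga_defaults(n_obs = 5000, d = 7)
short_series
long_series
```

With these formulas the number of generations always stays between 200 and 400, so smartMu evaluates to 100 whenever ngen is left at its default.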

ar0scale

an (optional) real valued vector of length two specifying the mean (the first element) and standard deviation (the second element) of the normal distribution from which the $\phi_{m,0}$ parameters are generated in the random mutations in the genetic algorithm. Default is c(1.5*avg*(1-c1/c0), max(c0, 4)), where avg is sample mean, c1 is the first sample autocovariance and c0 is sample variance.
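
The default of ar0scale can be approximated from the data with base R. The sketch below simulates a series purely for illustration and assumes that c0 and c1 are the lag-0 and lag-1 sample autocovariances as returned by stats::acf; the package may compute these quantities slightly differently.

```r
set.seed(1)
# Simulated AR(1) series shifted to have a nonzero mean (illustration only).
y <- as.numeric(arima.sim(list(ar = 0.7), n = 300)) + 10

# Sample autocovariances at lags 0 and 1 (assumed estimator: stats::acf).
acov <- acf(y, lag.max = 1, type = "covariance", plot = FALSE)$acf
c0 <- acov[1]
c1 <- acov[2]
avg <- mean(y)

# Mean and standard deviation used when mutating the phi_{m,0} parameters.
ar0scale <- c(1.5 * avg * (1 - c1 / c0), max(c0, 4))
ar0scale
```

If prior knowledge suggests that the regime intercepts lie in a different range, a vector of the same shape can be passed as ar0scale instead.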

sigmascale

an (optional) positive real number specifying the standard deviation of the (zero mean, positive only) normal distribution from which the component variance parameters are generated in the random mutations in the genetic algorithm. Default is 1+sd(data).
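
A minimal sketch of the default sigmascale and of draws from a zero-mean, positive-only normal distribution follows. It assumes that "positive only" means taking the absolute value of a normal draw, which is one common reading and not confirmed by the package source; the data are simulated for illustration.

```r
set.seed(2)
# Simulated data (illustration only).
y <- rnorm(200, mean = 0, sd = 3)

# Documented default: 1 + sd(data).
sigmascale <- 1 + sd(y)

# Assumed interpretation: absolute values of N(0, sigmascale^2) draws,
# used as candidate component variances in random mutations.
sigma2_draws <- abs(rnorm(5, mean = 0, sd = sigmascale))
sigma2_draws
```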

printRes

an (optional) logical argument defining whether results should be printed or not. Default is TRUE.

runTests

an (optional) logical argument defining whether quantile residual tests for the estimated model should be performed or not. Default is FALSE.

Value

Returns a list with...

$estimates

The estimated parameter vector...

For non-restricted models:

For GMAR model:: Size $((M(p+3)-1) \times 1)$ vector $\theta=(\upsilon_{1},...,\upsilon_{M},\alpha_{1},...,\alpha_{M-1})$, where $\upsilon_{m}=(\phi_{m,0},\phi_{m},\sigma_{m}^2)$ and $\phi_{m}=(\phi_{m,1},...,\phi_{m,p}), m=1,...,M$.
For StMAR model:: Size $((M(p+4)-1) \times 1)$ vector $(\theta,\nu)=(\upsilon_{1},...,\upsilon_{M},\alpha_{1},...,\alpha_{M-1},\nu_{1},...,\nu_{M})$.
With linear constraints:: Parameter vector as described above, but vectors $\phi_{m}$ replaced with vectors $\psi_{m}$ that satisfy $\phi_{m}=R_{m}\psi_{m}$ for all $m=1,...,M$, where $\psi_{m}=(\psi_{m,1},...,\psi_{m,q_{m}})$.

For restricted models:

For GMAR model:: Size $((3M+p-1) \times 1)$ vector $\theta=(\phi_{1,0},...,\phi_{M,0},\phi,\sigma_{1}^2,...,\sigma_{M}^2,\alpha_{1},...,\alpha_{M-1})$, where $\phi=(\phi_{1},...,\phi_{p})$.
For StMAR model:: Size $((4M+p-1) \times 1)$ vector $(\theta,\nu)=(\phi_{1,0},...,\phi_{M,0},\phi,\sigma_{1}^2,...,\sigma_{M}^2,\alpha_{1},...,\alpha_{M-1},\nu_{1},...,\nu_{M})$.
With linear constraints:: Parameter vector as described above, but vector $\phi$ replaced with vector $\psi$ that satisfies $\phi=R\psi$, where $\psi=(\psi_{1},...,\psi_{q})$.

Symbol $\phi$ denotes an AR coefficient, $\sigma^2$ a variance, $\alpha$ a mixing weight and $\nu$ a degrees of freedom parameter.

$stdErrors

Approximate standard errors of the estimates. NA values may sometimes occur because the observed information matrix is numerically estimated.

$loglikelihood

Log-likelihood value of the estimated model.

$IC

A data frame containing information criteria scores of the estimated model: $AIC, $BIC, $HQIC.
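
For orientation, the three criteria can be computed from a log-likelihood value using their standard textbook definitions. This is a sketch under that assumption: the helper info_criteria is hypothetical, the input numbers are made up, and the package may use a different effective sample size (for example with the conditional log-likelihood).

```r
# Hypothetical helper using the textbook definitions:
#   AIC  = -2*loglik + 2*d
#   BIC  = -2*loglik + d*log(n_obs)
#   HQIC = -2*loglik + 2*d*log(log(n_obs))
# loglik: maximized log-likelihood, d: number of parameters, n_obs: sample size.
info_criteria <- function(loglik, d, n_obs) {
  data.frame(AIC  = -2 * loglik + 2 * d,
             BIC  = -2 * loglik + d * log(n_obs),
             HQIC = -2 * loglik + 2 * d * log(log(n_obs)))
}

# Made-up inputs for illustration.
ic <- info_criteria(loglik = -350, d = 7, n_obs = 200)
ic
```

Smaller values indicate a better trade-off between fit and the number of parameters; BIC penalizes additional parameters more heavily than AIC at this sample size.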

$quantileResiduals

A numeric vector containing the quantile residuals of the estimated model.

$mixingWeights

A numeric matrix containing the mixing weights of the estimated model (the i-th column corresponds to the i-th regime).
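
To illustrate what such a matrix looks like, the sketch below computes time-varying mixing weights for a GMAR model with p = 1, assuming the definition in Kalliovirta et al. (2015): each weight is proportional to $\alpha_{m}$ times the stationary normal density of regime m evaluated at the previous observation. The function name gmar1_mixing_weights, the data and the parameter values are all made up, and the package may treat the initial observations differently.

```r
# Hypothetical helper for p = 1. Row t of the result holds the weights used
# for observation y[t + 1]; column m corresponds to regime m.
gmar1_mixing_weights <- function(y, phi0, phi1, sigma2, alpha) {
  mu  <- phi0 / (1 - phi1)        # stationary mean of each regime
  gam <- sigma2 / (1 - phi1^2)    # stationary variance of each regime
  y_lag <- y[-length(y)]
  # dens[t, m] = alpha_m * N(y_{t-1}; mu_m, gam_m)
  dens <- sapply(seq_along(alpha), function(m) {
    alpha[m] * dnorm(y_lag, mean = mu[m], sd = sqrt(gam[m]))
  })
  dens / rowSums(dens)            # normalize each row to sum to one
}

set.seed(3)
y <- rnorm(50, mean = 5, sd = 2)  # made-up data
w <- gmar1_mixing_weights(y, phi0 = c(1, 4), phi1 = c(0.8, 0.3),
                          sigma2 = c(1.5, 4), alpha = c(0.6, 0.4))
dim(w)
head(w, 3)
```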

$allEstimates

A list of estimated parameter vectors from all of the estimation rounds.

$allLoglikelihoods

A numeric vector containing the log-likelihood values from all of the estimation rounds. Corresponds to $allEstimates.

$converged

A logical vector indicating whether the quasi-Newton algorithm converged successfully in each estimation round. Corresponds to $allEstimates.
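
Because the log-likelihood function is multimodal, it can be useful to compare the rounds with each other. The sketch below works on made-up vectors shaped like $allLoglikelihoods and $converged; it does not call fitGMAR, and the variable names are illustrative.

```r
# Made-up values standing in for fit$allLoglikelihoods and fit$converged.
all_loglik <- c(-352.1, -348.7, -349.9, -348.7, -360.4)
converged  <- c(TRUE, TRUE, FALSE, TRUE, TRUE)

# Lowest, mean and largest log-likelihood over the rounds.
ll_summary <- c(min = min(all_loglik), mean = mean(all_loglik),
                max = max(all_loglik))
ll_summary

# Best round among those where the quasi-Newton step converged.
best_round <- which.max(ifelse(converged, all_loglik, -Inf))

# Number of rounds that reached (approximately) the largest value.
n_at_best <- sum(abs(all_loglik - max(all_loglik)) < 1e-2)
c(best_round = best_round, n_at_best = n_at_best)
```

If only one round reaches the largest value, increasing nCalls is a cheap way to gain some confidence that the estimate is not an isolated local maximum.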

$normality

A data frame containing results from the normality test. Returned only if runTests==TRUE.

$autocorrelation

A data frame containing results from the autocorrelation tests. Returned only if runTests==TRUE.

$cond.heteroscedasticity

A data frame containing results from the conditional heteroscedasticity tests. Returned only if runTests==TRUE.

$unconstrainedEstimates

A numeric parameter vector denoting the estimates without any constraints (if any were given). That is, instead of vectors $\psi_{m}$ the estimates are parametrized with vectors $\phi_{m}$ calculated from $\phi_{m}=R_{m}\psi_{m}$, or in the case of restricted models $\phi=R\psi$. Returned only if constraints==TRUE.

Printed results

The results printed out regarding the genetic algorithm and quasi-Newton estimations are the log-likelihood values the algorithms ended up with. The lowest value, mean value and largest value are printed to give perspective.

If quantile residual tests are run, the results from the tests are printed so that the letter "N" means normality test, "A" autocorrelation test and "H" conditional heteroscedasticity test. The numbers right next to "A" and "H" indicate the number of lags used in each test. The statistics following them are the corresponding test statistics and p-values. NA values mean that it was not (numerically) possible for the code to calculate all the necessary estimates for the tests.

Suggested packages

Install the suggested package "pbapply" if you wish to see a progress bar during parallel computing.

For faster evaluation of the quantile residuals of a StMAR model, install the suggested package "gsl". Note that for large StMAR models with large data the evaluations for the quantile residual tests may take a significantly long time without the package "gsl".

The optimization algorithms

The genetic algorithm is mostly based on the description by Dorsey R. E. and Mayer W. J. (1995). It uses individually adaptive crossover and mutation rates described by Patnaik L.M. and Srinivas M. (1994), with slight modifications.

The quasi-Newton method is implemented with function optim from the package stats.
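
The two-phase idea can be sketched on a much simpler problem. The example below fits a Gaussian AR(1) model by conditional maximum likelihood: a crude random search stands in for the genetic algorithm (it is not the algorithm used by the package), and the best candidate is then passed to stats::optim. The choice of method = "BFGS", the log-variance parametrization and all names are assumptions made for this illustration.

```r
set.seed(42)
# Simulated AR(1) data with true AR coefficient 0.6 (illustration only).
y <- as.numeric(arima.sim(list(ar = 0.6), n = 400))

# Negative conditional log-likelihood; par = (phi0, phi1, log(sigma^2)).
negloglik <- function(par, y) {
  n <- length(y)
  mu <- par[1] + par[2] * y[-n]
  -sum(dnorm(y[-1], mean = mu, sd = sqrt(exp(par[3])), log = TRUE))
}

# Phase 1: random search as a stand-in for the genetic algorithm.
cand <- cbind(rnorm(200, 0, 2), runif(200, -0.95, 0.95), rnorm(200, 0, 1))
vals <- apply(cand, 1, negloglik, y = y)
start <- cand[which.min(vals), ]

# Phase 2: quasi-Newton refinement starting from the best candidate.
fit <- optim(start, negloglik, y = y, method = "BFGS")
est <- c(phi0 = fit$par[1], phi1 = fit$par[2], sigma2 = exp(fit$par[3]))
est
```

The first phase only needs to land near a good mode; the second phase does the precise maximization, which is why the quality of the starting values matters for multimodal likelihoods.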

Details

The user should consider adjusting ar0scale and/or sigmascale according to the best available knowledge about the process.

Note that fitGMAR can't verify whether the found estimates denote the global or just a local maximum point. For more reliable results one should increase the number of estimation rounds (nCalls) to be performed.

References

Kalliovirta L., Meitz M. and Saikkonen P. (2015) Gaussian Mixture Autoregressive model for univariate time series. Journal of Time Series Analysis, 36, 247-266.
Kalliovirta L. (2012) Misspecification tests based on quantile residuals. The Econometrics Journal, 15, 358-393.
Dorsey R. E. and Mayer W. J. (1995) Genetic algorithms for estimation problems with multiple optima, nondifferentiability, and other irregular features. Journal of Business & Economic Statistics, 13, 53-66.
Patnaik L.M. and Srinivas M. (1994) Adaptive Probabilities of Crossover and Mutation in Genetic Algorithms. Transactions on Systems, Man and Cybernetics, 24, 656-667.
Lütkepohl H. (2005) New Introduction to Multiple Time Series Analysis. Springer.
Galbraith R. and Galbraith J. (1974) On the inverses of some patterned matrices arising in the theory of stationary time series. Journal of Applied Probability, 11, 63-71.
References regarding the StMAR model and general linear constraints will be updated after they are published.

Examples


# NOT RUN {
# GMAR model
fit12 <- fitGMAR(VIX, 1, 2, ar0scale=c(3, 2), runTests=TRUE)

# Restricted GMAR model
fit12r <- fitGMAR(VIX, 1, 2, restricted=TRUE, nCalls=10,
                  runTests=TRUE)

# StMAR model
fit12t <- fitGMAR(VIX, 1, 2, StMAR=TRUE, ar0scale=c(3, 2))

# Non-mixture version of StMAR model: without multicore
fit11t <- fitGMAR(VIX, 1, 1, StMAR=TRUE, multicore=FALSE, nCalls=4)

# Fit a GMAR model that is a mixture of an AR(1) model and an AR(3) model
# whose second AR coefficient is constrained to zero.
R <- list(matrix(c(1, 0, 0, 0, 0, 1), ncol=2), as.matrix(c(1, 0, 0)))
fit32c <- fitGMAR(VIX, 3, 2, constraints=TRUE, R=R, ar0scale=c(3, 2))

# Fit such constrained StMAR(3, 1) model that the second order AR coefficient
# is constrained to zero.
R0 <- matrix(c(1, 0, 0, 0, 0, 1), ncol=2)
fit31tc <- fitGMAR(VIX, 3, 1, StMAR=TRUE, constraints=TRUE, R=list(R0))

# Fit such StMAR(3,2) that the AR coefficients are restricted to be
# the same for both regimes and that the second AR coefficients are
# constrained to zero.
fit32trc <- fitGMAR(VIX, 3, 2, StMAR=TRUE, restricted=TRUE, constraints=TRUE,
                    R=matrix(c(1, 0, 0, 0, 0, 1), ncol=2))
# }
