fitGMAR
estimates a GMAR or StMAR model in two phases. It uses a genetic algorithm to find parameter values close to the maximum point
of the log-likelihood function and then uses them as starting values for a quasi-Newton method to find the maximum point.
fitGMAR(data, p, M, StMAR = FALSE, restricted = FALSE,
  constraints = FALSE, R, conditional = TRUE, nCalls, multicore = TRUE,
  ncores, initpop = FALSE, ngen, popsize, smartMu, ar0scale, sigmascale,
  printRes = TRUE, runTests = FALSE)
a numeric vector or column matrix containing the data. NA values are not supported.
a positive integer specifying the order of AR coefficients.
a positive integer specifying the number of mixture components or regimes.
an (optional) logical argument stating whether a StMAR model should be considered instead of a GMAR model. Default is FALSE.
an (optional) logical argument stating whether the AR coefficients \(\phi_{m,1},...,\phi_{m,p}\) are restricted
to be the same for all regimes. Default is FALSE.
an (optional) logical argument stating whether general linear constraints should be applied to the model. Default is FALSE.
Specifies the linear constraints.
For non-restricted models: a list of size \((p \times q_{m})\) constraint matrices \(R_{m}\) of full column rank satisfying \(\phi_{m} = R_{m}\psi_{m}\) for all \(m=1,...,M\), where \(\phi_{m} = (\phi_{m,1},...,\phi_{m,p})\) and \(\psi_{m} = (\psi_{m,1},...,\psi_{m,q_{m}})\).
For restricted models: a size \((p \times q)\) constraint matrix \(R\) of full column rank satisfying \(\phi = R\psi\), where \(\phi = (\phi_{1},...,\phi_{p})\) and \(\psi = (\psi_{1},...,\psi_{q})\).
Symbol \(\phi\) denotes an AR coefficient. Note that regardless of any constraints, the nominal order of AR coefficients is always p for all regimes.
This argument is ignored if constraints==FALSE.
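The relation \(\phi = R\psi\) can be checked in base R before calling the estimation function. The sketch below reuses the constraint matrices that appear in the examples of this page; the values in `psi1` and `psi2` are made up for illustration only.

```r
# Constraint matrix for an AR(3) regime whose second AR coefficient is zero
R1 <- matrix(c(1, 0, 0, 0, 0, 1), ncol = 2)   # size (3 x 2), so q = 2
# Constraint matrix for an AR(1) regime inside nominal order p = 3
R2 <- as.matrix(c(1, 0, 0))                   # size (3 x 1), so q = 1

# Full column rank is required of every constraint matrix
full_rank <- c(qr(R1)$rank == ncol(R1), qr(R2)$rank == ncol(R2))

# Hypothetical free parameters psi and the AR coefficients they imply
psi1 <- c(0.6, 0.2)
psi2 <- 0.5
phi1 <- as.numeric(R1 %*% psi1)   # 0.6 0.0 0.2
phi2 <- as.numeric(R2 %*% psi2)   # 0.5 0.0 0.0

# For non-restricted models the matrices are passed as a list, one per regime
R <- list(R1, R2)
```

Both implied vectors have length p = 3, which matches the note that the nominal AR order stays p for all regimes.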
an (optional) logical argument specifying whether the conditional or exact log-likelihood function should be used. Default is TRUE.
an (optional) positive integer specifying how many rounds of estimation should be done.
The estimation results may vary from round to round because of multimodality of the log-likelihood function
and randomness associated with the genetic algorithm. Default is round(10 + 9*log(M)).
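To see how the default number of estimation rounds grows with the number of regimes, the stated formula can be evaluated directly. `default_nCalls` is a hypothetical helper name used only for this illustration.

```r
# Default number of estimation rounds as stated above: round(10 + 9*log(M))
default_nCalls <- function(M) round(10 + 9 * log(M))

# Evaluate for M = 1, ..., 4 regimes
rounds <- sapply(1:4, default_nCalls)
rounds   # 10 16 20 22
```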
an (optional) logical argument defining whether parallel computing should be used in the estimation process.
Highly recommended; default is TRUE.
an (optional) positive integer specifying the number of cores to be used in the estimation process.
By default the number of available cores is detected with parallel::detectCores()
and all of them are used. Ignored if multicore==FALSE.
an (optional) list of parameter vectors from which the initial population of the genetic algorithm will be generated. The parameter vectors should be of the form...
For non-restricted models:
For GMAR model: Size \((M(p+3)-1) \times 1\) vector \(\theta = (\upsilon_{1},...,\upsilon_{M}, \alpha_{1},...,\alpha_{M-1})\), where \(\upsilon_{m} = (\phi_{m,0}, \phi_{m}, \sigma_{m}^2)\) and \(\phi_{m} = (\phi_{m,1},...,\phi_{m,p}), m=1,...,M\).
For StMAR model: Size \((M(p+4)-1) \times 1\) vector \((\theta, \nu) = (\upsilon_{1},...,\upsilon_{M}, \alpha_{1},...,\alpha_{M-1}, \nu_{1},...,\nu_{M})\).
With linear constraints: Replace the vectors \(\phi_{m}\) with vectors \(\psi_{m}\) and provide a list of constraint matrices R that satisfy \(\phi_{m} = R_{m}\psi_{m}\) for all \(m=1,...,M\), where \(\psi_{m} = (\psi_{m,1},...,\psi_{m,q_{m}})\).
For restricted models:
For GMAR model: Size \((3M+p-1) \times 1\) vector \(\theta = (\phi_{1,0},...,\phi_{M,0}, \phi, \sigma_{1}^2,...,\sigma_{M}^2, \alpha_{1},...,\alpha_{M-1})\), where \(\phi = (\phi_{1},...,\phi_{p})\).
For StMAR model: Size \((4M+p-1) \times 1\) vector \((\theta, \nu) = (\phi_{1,0},...,\phi_{M,0}, \phi, \sigma_{1}^2,...,\sigma_{M}^2, \alpha_{1},...,\alpha_{M-1}, \nu_{1},...,\nu_{M})\).
With linear constraints: Replace the vector \(\phi\) with vector \(\psi\) and provide a constraint matrix \(R\) that satisfies \(\phi = R\psi\), where \(\psi = (\psi_{1},...,\psi_{q})\).
Symbol \(\phi\) denotes an AR coefficient, \(\sigma^2\) a variance, \(\alpha\) a mixing weight and \(\nu\) a degrees of
freedom parameter.
Note that in the case M=1 the parameter \(\alpha\) is dropped, and in the case of StMAR model
the degrees of freedom parameters \(\nu_{m}\) have to be larger than \(2\).
If not specified (or FALSE as is default), the initial population will be drawn randomly.
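As a concrete illustration of the parameter vector layouts, the sketch below builds candidate vectors for p = 1 and M = 2 and checks their lengths against the stated sizes. All numeric values are invented for the example and are not recommended starting values.

```r
p <- 1; M <- 2

# Non-restricted GMAR: (phi_{1,0}, phi_{1,1}, sigma_1^2, phi_{2,0}, phi_{2,1}, sigma_2^2, alpha_1)
theta_gmar <- c(1.0, 0.8, 1.5,  4.0, 0.5, 9.0,  0.7)

# Non-restricted StMAR: same layout followed by nu_1, nu_2 (each must exceed 2)
theta_stmar <- c(theta_gmar, 5, 12)

# Restricted GMAR: (phi_{1,0}, phi_{2,0}, phi_1, sigma_1^2, sigma_2^2, alpha_1)
theta_gmar_r <- c(1.0, 4.0, 0.8, 1.5, 9.0, 0.7)

# Lengths implied by the size formulas above
sizes <- c(gmar   = M * (p + 3) - 1,   # 7
           stmar  = M * (p + 4) - 1,   # 9
           gmar_r = 3 * M + p - 1)     # 6

# initpop is a list of such vectors; here two hypothetical GMAR candidates
initpop <- list(theta_gmar, c(0.5, 0.9, 1.0,  3.0, 0.6, 6.0,  0.5))
```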
an (optional) positive integer specifying the number of generations to be run through in the genetic algorithm. Default is min(400, max(round(0.1*length(data)), 200)).
an (optional) positive even integer specifying the population size in the genetic algorithm. Default is 10*d where d is the number of parameters.
an (optional) positive integer specifying the generation after which the random mutations in the genetic algorithm are "smart".
This means that mutating individuals will mostly mutate fairly close to the best fitting individual so far. Default is min(100, round(0.5*ngen)).
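The defaults for ngen, popsize and smartMu can be worked out by hand from the formulas above. In the sketch below the helper function names are hypothetical, and the parameter count d assumes a non-restricted GMAR model with p = 1 and M = 2.

```r
# Default number of generations as a function of the series length n
default_ngen <- function(n) min(400, max(round(0.1 * n), 200))
# Default generation after which mutations become "smart"
default_smartMu <- function(ngen) min(100, round(0.5 * ngen))

ngen_by_n <- sapply(c(500, 3000, 10000), default_ngen)   # 200 300 400

# Population size is 10*d; here d = M(p+3)-1 = 7
d <- 2 * (1 + 3) - 1
popsize <- 10 * d                                         # 70

# With the default ngen (always at least 200) smartMu is 100;
# a smaller user-supplied ngen lowers it
smartMu_default <- sapply(ngen_by_n, default_smartMu)     # 100 100 100
smartMu_short   <- default_smartMu(100)                   # 50
```

Because the default ngen never falls below 200, the default smartMu only drops below 100 when ngen is set manually to a value under 200.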
an (optional) real valued vector of length two specifying the mean (the first element) and standard deviation (the second element) of the normal distribution
from which the \(\phi_{m,0}\) parameters are generated in the random mutations in the genetic algorithm. Default is c(1.5*avg*(1-c1/c0), max(c0, 4)),
where avg is the sample mean, c1 is the first sample autocovariance and c0 is the sample variance.
an (optional) positive real number specifying the standard deviation of the (zero mean, positive only) normal distribution
from which the component variance parameters are generated in the random mutations in the genetic algorithm. Default is 1+sd(data).
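The default values of ar0scale and sigmascale depend only on simple sample statistics, so they can be approximated before fitting to judge whether they suit the data. The sketch below uses a simulated series and `stats::acf` for the autocovariances; whether the function itself uses the same variance estimator (divisor n rather than n-1) is an assumption here.

```r
set.seed(42)
# Simulated AR(1) series with mean shifted to 5, standing in for real data
x <- as.numeric(arima.sim(list(ar = 0.7), n = 300)) + 5

avg  <- mean(x)
# Sample autocovariances at lags 0 and 1 (acf uses divisor n)
acov <- acf(x, lag.max = 1, type = "covariance", plot = FALSE)$acf
c0 <- acov[1]   # sample variance
c1 <- acov[2]   # first sample autocovariance

# Formulas as stated in the argument descriptions above
ar0scale   <- c(1.5 * avg * (1 - c1 / c0), max(c0, 4))
sigmascale <- 1 + sd(x)
```

If prior knowledge suggests regime intercepts far from the first element of `ar0scale`, supplying a custom vector is the adjustment the details section recommends.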
an (optional) logical argument defining whether results should be printed or not. Default is TRUE.
an (optional) logical argument defining whether quantile residual tests for the estimated model should be performed or not. Default is FALSE.
Returns a list with...
$estimates
The estimated parameter vector...
For non-restricted models:
For GMAR model: Size \((M(p+3)-1) \times 1\) vector \(\theta = (\upsilon_{1},...,\upsilon_{M}, \alpha_{1},...,\alpha_{M-1})\), where \(\upsilon_{m} = (\phi_{m,0}, \phi_{m}, \sigma_{m}^2)\) and \(\phi_{m} = (\phi_{m,1},...,\phi_{m,p}), m=1,...,M\).
For StMAR model: Size \((M(p+4)-1) \times 1\) vector \((\theta, \nu) = (\upsilon_{1},...,\upsilon_{M}, \alpha_{1},...,\alpha_{M-1}, \nu_{1},...,\nu_{M})\).
With linear constraints: Parameter vector as described above, but vectors \(\phi_{m}\) replaced with vectors \(\psi_{m}\) that satisfy \(\phi_{m} = R_{m}\psi_{m}\) for all \(m=1,...,M\), where \(\psi_{m} = (\psi_{m,1},...,\psi_{m,q_{m}})\).
For restricted models:
For GMAR model: Size \((3M+p-1) \times 1\) vector \(\theta = (\phi_{1,0},...,\phi_{M,0}, \phi, \sigma_{1}^2,...,\sigma_{M}^2, \alpha_{1},...,\alpha_{M-1})\), where \(\phi = (\phi_{1},...,\phi_{p})\).
For StMAR model: Size \((4M+p-1) \times 1\) vector \((\theta, \nu) = (\phi_{1,0},...,\phi_{M,0}, \phi, \sigma_{1}^2,...,\sigma_{M}^2, \alpha_{1},...,\alpha_{M-1}, \nu_{1},...,\nu_{M})\).
With linear constraints: Parameter vector as described above, but vector \(\phi\) replaced with vector \(\psi\) that satisfies \(\phi = R\psi\), where \(\psi = (\psi_{1},...,\psi_{q})\).
$stdErrors
Approximate standard errors of the estimates. NA values may sometimes occur because the observed information matrix is numerically estimated.
$loglikelihood
Log-likelihood value of the estimated model.
$IC
A data frame containing information criteria scores of the estimated model: $AIC, $BIC, $HQIC.
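This page does not state the formulas behind $AIC, $BIC and $HQIC. As a rough guide, the sketch below computes the textbook versions of the three criteria from a log-likelihood value; the helper `info_criteria`, the input numbers, and the choice of `nobs` as the effective sample size are all assumptions, so the scores returned by the function may differ.

```r
# Textbook information criteria (assumed definitions, not taken from the package)
info_criteria <- function(loglik, d, nobs) {
  data.frame(
    AIC  = -2 * loglik + 2 * d,
    BIC  = -2 * loglik + d * log(nobs),
    HQIC = -2 * loglik + 2 * d * log(log(nobs))
  )
}

# Made-up inputs: log-likelihood -100, d = 7 parameters, 200 observations
ic <- info_criteria(loglik = -100, d = 7, nobs = 200)
```

Smaller values indicate a better trade-off between fit and the number of parameters under each criterion.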
$quantileResiduals
A numeric vector containing the quantile residuals of the estimated model.
$mixingWeights
A numeric matrix containing the mixing weights of the estimated model (the i-th column corresponds to the i-th regime).
$allEstimates
A list of estimated parameter vectors from all of the estimation rounds.
$allLoglikelihoods
A numeric vector containing the log-likelihood values from all of the estimation rounds. Corresponds to $allEstimates.
$converged
A logical vector indicating whether the quasi-Newton algorithm converged successfully in each estimation round. Corresponds to $allEstimates.
$normality
A data frame containing results from the normality test. Returned only if runTests==TRUE.
$autocorrelation
A data frame containing results from the autocorrelation tests. Returned only if runTests==TRUE.
$cond.heteroscedasticity
A data frame containing results from the conditional heteroscedasticity tests. Returned only if runTests==TRUE.
$unconstrainedEstimates
A numeric parameter vector denoting the estimates without any constraints (if any were given). That is, instead of
vectors \(\psi_{m}\) the estimates are parametrized with vectors \(\phi_{m}\) calculated from
\(\phi_{m} = R_{m}\psi_{m}\), or in the case of restricted models
\(\phi = R\psi\). Returned only if constraints==TRUE.
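The mapping from a constrained parameter vector to its unconstrained form can be sketched in base R for a non-restricted GMAR model. The helper `expand_constrained` is hypothetical and written only to illustrate the relation \(\phi_{m} = R_{m}\psi_{m}\); it is not the package's internal routine, and the parameter values are invented.

```r
# Expand (phi_{m,0}, psi_m, sigma_m^2) blocks into (phi_{m,0}, phi_m, sigma_m^2) blocks
expand_constrained <- function(params, R) {
  M <- length(R)
  q <- vapply(R, ncol, integer(1))          # number of free AR parameters per regime
  out <- numeric(0)
  pos <- 1
  for (m in seq_len(M)) {
    phi0   <- params[pos]
    psi    <- params[(pos + 1):(pos + q[m])]
    sigma2 <- params[pos + q[m] + 1]
    out <- c(out, phi0, as.numeric(R[[m]] %*% psi), sigma2)
    pos <- pos + q[m] + 2
  }
  # Remaining elements are the mixing weights alpha_1, ..., alpha_{M-1} (none if M = 1)
  rest <- if (pos <= length(params)) params[pos:length(params)] else numeric(0)
  c(out, rest)
}

# Constraint matrices from the examples: AR(3) with zero second coefficient, and AR(1)
R <- list(matrix(c(1, 0, 0, 0, 0, 1), ncol = 2), as.matrix(c(1, 0, 0)))
constrained <- c(1.0, 0.6, 0.2, 0.5,   2.0, 0.4, 1.5,   0.7)
unconstrained <- expand_constrained(constrained, R)
# 1.0 0.6 0.0 0.2 0.5 2.0 0.4 0.0 0.0 1.5 0.7
```

The expanded vector has length M(p+3)-1 = 11 for p = 3 and M = 2, matching the size of an unconstrained GMAR parameter vector.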
The results printed out regarding the genetic algorithm and quasi-Newton estimations are the log-likelihood values the algorithms ended up with. The lowest value, mean value and largest value are printed to give perspective.
If quantile residual tests are run, the results from the tests are printed so that the letter "N" means normality test, "A" autocorrelation test
and "H" conditional heteroscedasticity test. The numbers right next to "A" and "H" indicate the number of lags used
in each test. The statistics following them are the corresponding test statistics and p-values.
NA values mean that it was not (numerically) possible for the code to calculate all the necessary estimates for the tests.
Install the suggested package "pbapply" if you wish to see a progress bar during parallel computing.
For faster evaluation of the quantile residuals of a StMAR model, install the suggested package "gsl". Note that for large StMAR models with large data, the evaluations for the quantile residual tests may take a significantly long time without the package "gsl".
The genetic algorithm is mostly based on the description by Dorsey R. E. and Mayer W. J. (1995). It uses individually adaptive crossover and mutation rates described by Patnaik L.M. and Srinivas M. (1994), with slight modifications.
The quasi-Newton method is implemented with function optim from the package stats.
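The two-phase idea can be illustrated on a toy problem. In the sketch below a crude select-and-mutate population search stands in for the genetic algorithm (it is not the adaptive algorithm described above), and `stats::optim` with method "BFGS" plays the role of the quasi-Newton phase. The objective `loglik_toy` is a made-up multimodal function, not a GMAR log-likelihood.

```r
set.seed(1)
# Toy multimodal objective standing in for a log-likelihood (hypothetical)
loglik_toy <- function(x) sin(3 * x[1]) - 0.1 * (x[1] - 2)^2

# Phase 1: simple population search (a stand-in for the genetic algorithm)
pop <- runif(50, min = -5, max = 10)
for (gen in 1:20) {
  fitness <- vapply(pop, loglik_toy, numeric(1))
  best <- pop[order(fitness, decreasing = TRUE)[1:10]]        # keep the 10 fittest
  pop  <- c(best, rep(best, each = 4) + rnorm(40, sd = 0.5))  # mutate around survivors
}
fitness <- vapply(pop, loglik_toy, numeric(1))
start <- pop[which.max(fitness)]

# Phase 2: quasi-Newton refinement from the best individual;
# fnscale = -1 makes optim maximize instead of minimize
fit <- optim(par = start, fn = loglik_toy, method = "BFGS",
             control = list(fnscale = -1))
```

The population search only needs to land near a good mode; the quasi-Newton step then sharpens the estimate, which is why the first phase can afford to be approximate.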
The user should consider adjusting ar0scale and/or sigmascale according to the best available knowledge about the process.
Note that fitGMAR can't verify whether the found estimates denote the global or just a local maximum point.
For more reliable results one should increase the number of estimation rounds (nCalls) to be performed.
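One informal way to judge whether enough rounds were run is to check how many converged rounds reached (nearly) the best log-likelihood. The sketch below uses made-up vectors shaped like the $allLoglikelihoods and $converged elements of the returned list; the tolerance 0.5 is an arbitrary choice for illustration.

```r
# Made-up results from five estimation rounds
allLoglikelihoods <- c(-310.2, -305.7, -305.7, -312.9, -305.8)
converged         <- c(TRUE, TRUE, TRUE, FALSE, TRUE)

# Round with the largest log-likelihood (first one in case of ties)
best_round <- which.max(allLoglikelihoods)

# Converged rounds that ended within 0.5 of the best value
near_best <- abs(allLoglikelihoods - max(allLoglikelihoods)) < 0.5
n_agree   <- sum(near_best & converged)
```

If only one round reaches the best value, increasing nCalls is a reasonable precaution; several agreeing rounds give more, though never complete, assurance.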
Kalliovirta L., Meitz M. and Saikkonen P. (2015) Gaussian Mixture Autoregressive model for univariate time series. Journal of Time Series Analysis, 36, 247-266.
Kalliovirta L. (2012) Misspecification tests based on quantile residuals. The Econometrics Journal, 15, 358-393.
Dorsey R. E. and Mayer W. J. (1995) Genetic algorithms for estimation problems with multiple optima, nondifferentiability, and other irregular features. Journal of Business & Economic Statistics, 13, 53-66.
Patnaik L.M. and Srinivas M. (1994) Adaptive Probabilities of Crossover and Mutation in Genetic Algorithms. Transactions on Systems, Man and Cybernetics, 24, 656-667.
Lütkepohl H. (2005) New Introduction to Multiple Time Series Analysis. Springer.
Galbraith R. and Galbraith J. (1974) On the inverses of some patterned matrices arising in the theory of stationary time series. Journal of Applied Probability, 11, 63-71.
References regarding the StMAR model and general linear constraints will be updated after they are published.
# NOT RUN {
# fitGMAR and the VIX data are assumed here to come from the uGMAR package
library(uGMAR)
# GMAR model
fit12 <- fitGMAR(VIX, 1, 2, ar0scale=c(3, 2), runTests=TRUE)
# Restricted GMAR model
fit12r <- fitGMAR(VIX, 1, 2, restricted=TRUE, nCalls=10,
                  runTests=TRUE)
# StMAR model
fit12t <- fitGMAR(VIX, 1, 2, StMAR=TRUE, ar0scale=c(3, 2))
# Non-mixture version of StMAR model: without multicore
fit11t <- fitGMAR(VIX, 1, 1, StMAR=TRUE, multicore=FALSE, nCalls=4)
# Fit a GMAR model that is a mixture of an AR(3) model whose second AR
# coefficient is constrained to zero (first regime) and an AR(1) model
# (second regime).
R <- list(matrix(c(1, 0, 0, 0, 0, 1), ncol=2), as.matrix(c(1, 0, 0)))
fit32c <- fitGMAR(VIX, 3, 2, constraints=TRUE, R=R, ar0scale=c(3, 2))
# Fit such constrained StMAR(3, 1) model that the second order AR coefficient
# is constrained to zero.
R0 <- matrix(c(1, 0, 0, 0, 0, 1), ncol=2)
fit31tc <- fitGMAR(VIX, 3, 1, StMAR=TRUE, constraints=TRUE, R=list(R0))
# Fit such StMAR(3,2) that the AR coefficients are restricted to be
# the same for both regimes and that the second AR coefficients are
# constrained to zero.
fit32trc <- fitGMAR(VIX, 3, 2, StMAR=TRUE, restricted=TRUE, constraints=TRUE,
                    R=matrix(c(1, 0, 0, 0, 0, 1), ncol=2))
# }