GAfit: Genetic algorithm for preliminary estimation of GMAR or StMAR model

Description

GAfit estimates the specified GMAR or StMAR model using a genetic algorithm. It is designed to find starting values for gradient-based methods.

Usage

GAfit(data, p, M, StMAR = FALSE, restricted = FALSE, constraints = FALSE,
  R, conditional = TRUE, ngen, popsize, smartMu, ar0scale, sigmascale,
  initpop = FALSE, epsilon, minval)

Arguments

data

a numeric vector or column matrix containing the data. NA values are not supported.

p

a positive integer specifying the order of the AR coefficients.

M

a positive integer specifying the number of mixture components or regimes.

StMAR

an (optional) logical argument stating whether a StMAR model should be considered instead of a GMAR model. Default is FALSE.

restricted

an (optional) logical argument stating whether the AR coefficients $\phi_{m,1},...,\phi_{m,p}$ are restricted to be the same for all regimes. Default is FALSE.

constraints

an (optional) logical argument stating whether general linear constraints should be applied to the model. Default is FALSE.

R

Specifies the linear constraints.

For non-restricted models: a list of size $(p \times q_{m})$ constraint matrices $R_{m}$ of full column rank satisfying $\phi_{m} = R_{m}\psi_{m}$ for all $m=1,...,M$, where $\phi_{m} = (\phi_{m,1},...,\phi_{m,p})$ and $\psi_{m} = (\psi_{m,1},...,\psi_{m,q_{m}})$.
For restricted models: a size $(p \times q)$ constraint matrix $R$ of full column rank satisfying $\phi = R\psi$, where $\phi = (\phi_{1},...,\phi_{p})$ and $\psi = (\psi_{1},...,\psi_{q})$.

Symbol $\phi$ denotes an AR coefficient. Note that regardless of any constraints, the nominal order of AR coefficients is always p for all regimes. This argument is ignored if constraints==FALSE.
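
As an illustration of the relation $\phi_{m} = R_{m}\psi_{m}$, the following base-R sketch builds constraint matrices for a hypothetical non-restricted model with p = 3 and M = 2. The particular matrices and the values of psi are made up for illustration; only the shapes and the full-column-rank requirement come from the description above.

```r
# Hypothetical example: p = 3 lags, M = 2 regimes.
p <- 3

# Regime 1: second AR coefficient fixed to zero, so q_1 = 2.
R1 <- matrix(c(1, 0,
               0, 0,
               0, 1), nrow = p, byrow = TRUE)

# Regime 2: all three AR coefficients constrained to be equal, so q_2 = 1.
R2 <- matrix(1, nrow = p, ncol = 1)

R_list <- list(R1, R2)

# Each R_m must have p rows and full column rank.
has_full_col_rank <- function(R) qr(R)$rank == ncol(R)
stopifnot(all(sapply(R_list, nrow) == p),
          all(sapply(R_list, has_full_col_rank)))

# Recover the AR coefficients phi_m = R_m %*% psi_m from made-up psi values.
psi1 <- c(0.5, -0.2)
psi2 <- 0.3
phi1 <- as.numeric(R1 %*% psi1)  # 0.5, 0.0, -0.2
phi2 <- as.numeric(R2 %*% psi2)  # 0.3, 0.3, 0.3
```

A list such as R_list is the kind of object the R argument expects for non-restricted models; for restricted models a single matrix would be passed instead.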

conditional

an (optional) logical argument specifying whether the conditional or exact log-likelihood function should be used. Default is TRUE.

ngen

an (optional) positive integer specifying the number of generations to be run through in the genetic algorithm. Default is min(400, max(round(0.1*length(data)), 200)).

popsize

an (optional) positive even integer specifying the population size in the genetic algorithm. Default is 10*d where d is the number of parameters.

smartMu

an (optional) positive integer specifying the generation after which the random mutations in the genetic algorithm are "smart". This means that mutating individuals will mostly mutate fairly close to the best fitting individual so far. Default is min(100, round(0.5*ngen)).
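
The default values of ngen, popsize and smartMu can be worked out by hand from the formulas above. The sketch below does so for a hypothetical series of 1000 observations and a non-restricted GMAR model with p = 2 and M = 2; the parameter count d = M(p+3)-1 is taken from the parameter vector size given for initpop, and the variable n_obs is introduced here only for illustration.

```r
# Hypothetical setting: 1000 observations, non-restricted GMAR, p = 2, M = 2.
n_obs <- 1000
p <- 2
M <- 2

# Default number of generations.
ngen <- min(400, max(round(0.1 * n_obs), 200))  # 200

# Number of parameters d for a non-restricted GMAR model: M(p+3) - 1.
d <- M * (p + 3) - 1                            # 9

# Default population size (a multiple of 10, hence always even).
popsize <- 10 * d                               # 90

# Default generation after which the mutations become "smart".
smartMu <- min(100, round(0.5 * ngen))          # 100
```

With these formulas ngen always lies between 200 and 400, so the default smartMu is always 100.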

ar0scale

an (optional) real-valued vector of length two specifying the mean (the first element) and standard deviation (the second element) of the normal distribution from which the $\phi_{m,0}$ parameters are generated in the random mutations in the genetic algorithm. Default is c(1.5*avg*(1-c1/c0), max(c0, 4)), where avg is the sample mean, c1 is the first sample autocovariance and c0 is the sample variance.

sigmascale

an (optional) positive real number specifying the standard deviation of the (zero mean, positive only) normal distribution from which the component variance parameters are generated in the random mutations in the genetic algorithm. Default is 1+sd(data).
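
To see what the default ar0scale and sigmascale look like for a given series, the following sketch evaluates the stated formulas on simulated data. It is an approximation under assumptions: the sample autocovariances are taken from stats::acf (divisor n), which may differ slightly from what the package computes internally, and drawing a "zero mean, positive only" normal variate as abs(rnorm(...)) is one possible reading of the description, not a confirmed implementation detail.

```r
# Simulated AR(1) series with mean 2, used only as example data.
set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.5), n = 200)) + 2

# Sample autocovariances at lags 0 and 1 (assumption: divisor n, as in acf()).
acov <- acf(x, type = "covariance", lag.max = 1, plot = FALSE)$acf
c0 <- acov[1]
c1 <- acov[2]
avg <- mean(x)

# Default ar0scale: mean and standard deviation for generating phi_{m,0}.
ar0scale <- c(1.5 * avg * (1 - c1 / c0), max(c0, 4))

# Default sigmascale: standard deviation for generating variance parameters.
sigmascale <- 1 + sd(x)

# One possible way to draw from these distributions in a mutation step.
phi0_draw   <- rnorm(1, mean = ar0scale[1], sd = ar0scale[2])
sigma2_draw <- abs(rnorm(1, mean = 0, sd = sigmascale))
```

If prior knowledge suggests that the regime intercepts or variances are far from these scales, supplying ar0scale or sigmascale explicitly is likely to help the search.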

initpop

an (optional) list of parameter vectors from which the initial population of the genetic algorithm will be generated. The parameter vectors should be of the following form.

For non-restricted models:

For GMAR model: a size $((M(p+3)-1) \times 1)$ vector $\theta = (\upsilon_{1},...,\upsilon_{M}, \alpha_{1},...,\alpha_{M-1})$, where $\upsilon_{m} = (\phi_{m,0}, \phi_{m}, \sigma_{m}^2)$ and $\phi_{m} = (\phi_{m,1},...,\phi_{m,p})$, $m=1,...,M$.
For StMAR model: a size $((M(p+4)-1) \times 1)$ vector $(\theta, \nu) = (\upsilon_{1},...,\upsilon_{M}, \alpha_{1},...,\alpha_{M-1}, \nu_{1},...,\nu_{M})$.
With linear constraints: replace the vectors $\phi_{m}$ with vectors $\psi_{m}$ and provide a list of constraint matrices $R_{m}$ that satisfy $\phi_{m} = R_{m}\psi_{m}$ for all $m=1,...,M$, where $\psi_{m} = (\psi_{m,1},...,\psi_{m,q_{m}})$.

For restricted models:

For GMAR model: a size $((3M+p-1) \times 1)$ vector $\theta = (\phi_{1,0},...,\phi_{M,0}, \phi, \sigma_{1}^2,...,\sigma_{M}^2, \alpha_{1},...,\alpha_{M-1})$, where $\phi = (\phi_{1},...,\phi_{p})$.
For StMAR model: a size $((4M+p-1) \times 1)$ vector $(\theta, \nu) = (\phi_{1,0},...,\phi_{M,0}, \phi, \sigma_{1}^2,...,\sigma_{M}^2, \alpha_{1},...,\alpha_{M-1}, \nu_{1},...,\nu_{M})$.
With linear constraints: replace the vector $\phi$ with vector $\psi$ and provide a constraint matrix $R$ that satisfies $\phi = R\psi$, where $\psi = (\psi_{1},...,\psi_{q})$.

Symbol $\phi$ denotes an AR coefficient, $\sigma^2$ a variance, $\alpha$ a mixing weight and $\nu$ a degrees of freedom parameter. Note that in the case M=1 the parameter $\alpha$ is dropped, and in the case of a StMAR model the degrees of freedom parameters $\nu_{m}$ have to be larger than $2$. If not specified (or FALSE, as is the default), the initial population will be drawn randomly.
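
The parameter vector layouts above can be made concrete with a small sketch. It assembles vectors for a hypothetical model with p = 1 and M = 2 and checks their lengths against the stated sizes; all numeric values are made up for illustration.

```r
p <- 1
M <- 2

# Non-restricted GMAR:
# (phi_{1,0}, phi_{1,1}, sigma_1^2, phi_{2,0}, phi_{2,1}, sigma_2^2, alpha_1)
theta_gmar <- c(0.5, 0.8, 1.0,   2.0, -0.3, 2.5,   0.6)

# Non-restricted StMAR: append nu_1, nu_2 (each must be larger than 2).
theta_stmar <- c(theta_gmar, 4, 10)

# Restricted GMAR:
# (phi_{1,0}, phi_{2,0}, phi_1, sigma_1^2, sigma_2^2, alpha_1)
theta_gmar_r <- c(0.5, 2.0, 0.8, 1.0, 2.5, 0.6)

# Check the lengths against the sizes given in the documentation.
stopifnot(length(theta_gmar)   == M * (p + 3) - 1,  # 7
          length(theta_stmar)  == M * (p + 4) - 1,  # 9
          length(theta_gmar_r) == 3 * M + p - 1)    # 6

# initpop is a list of such vectors, all of the same form.
initpop <- list(theta_gmar,
                c(0.1, 0.5, 2.0,   1.0, 0.2, 0.5,   0.3))
```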

epsilon

an (optional) negative real number specifying the logarithm of the smallest positive non-zero number that will be handled without external packages. Too small a value may lead to a failure or biased results, and too large a value will make the code run significantly slower. Default is round(log(.Machine$double.xmin)+10) and should not be adjusted too much.
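
The default epsilon can be evaluated directly in base R. The sketch below also shows why a bound of this kind matters: exponentiating a sufficiently negative number underflows to exactly zero in double precision. The value in the comment assumes IEEE 754 double-precision arithmetic.

```r
# Default epsilon: log of the smallest normalized double, plus a margin of 10.
epsilon <- round(log(.Machine$double.xmin) + 10)  # -698 with IEEE 754 doubles

# exp(epsilon) is still representable as a positive number ...
exp(epsilon) > 0    # TRUE

# ... whereas a much smaller exponent underflows to zero.
exp(-800) == 0      # TRUE
```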

minval

a real number defining the minimum value of the log-likelihood function that will be considered. Values smaller than this will be treated as if they were minval, and the corresponding individuals will never survive.
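
The effect of minval can be pictured as a floor applied to the log-likelihood values of the population. The following is a minimal sketch of that idea using pmax; the value of minval and the log-likelihoods are hypothetical, and the package's internal handling may differ.

```r
# Hypothetical floor and log-likelihoods of four individuals.
minval <- -9999
loglik <- c(-350.2, -1e6, -Inf, -412.7)

# Values below minval are treated as if they were minval.
fitness <- pmax(loglik, minval)

# Individuals sitting at the floor would never be selected to survive.
at_floor <- fitness == minval
```

Here the second and third individuals end up at the floor, so only the first and fourth compete on their actual log-likelihoods.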

Value

Returns the estimated parameter vector, which has the following form.

For non-restricted models:

For GMAR model: a size $((M(p+3)-1) \times 1)$ vector $\theta = (\upsilon_{1},...,\upsilon_{M}, \alpha_{1},...,\alpha_{M-1})$, where $\upsilon_{m} = (\phi_{m,0}, \phi_{m}, \sigma_{m}^2)$ and $\phi_{m} = (\phi_{m,1},...,\phi_{m,p})$, $m=1,...,M$.
For StMAR model: a size $((M(p+4)-1) \times 1)$ vector $(\theta, \nu) = (\upsilon_{1},...,\upsilon_{M}, \alpha_{1},...,\alpha_{M-1}, \nu_{1},...,\nu_{M})$.
With linear constraints: parameter vector as described above, but with the vectors $\phi_{m}$ replaced by vectors $\psi_{m}$ that satisfy $\phi_{m} = R_{m}\psi_{m}$ for all $m=1,...,M$, where $\psi_{m} = (\psi_{m,1},...,\psi_{m,q_{m}})$.

For restricted models:

For GMAR model: a size $((3M+p-1) \times 1)$ vector $\theta = (\phi_{1,0},...,\phi_{M,0}, \phi, \sigma_{1}^2,...,\sigma_{M}^2, \alpha_{1},...,\alpha_{M-1})$, where $\phi = (\phi_{1},...,\phi_{p})$.
For StMAR model: a size $((4M+p-1) \times 1)$ vector $(\theta, \nu) = (\phi_{1,0},...,\phi_{M,0}, \phi, \sigma_{1}^2,...,\sigma_{M}^2, \alpha_{1},...,\alpha_{M-1}, \nu_{1},...,\nu_{M})$.
With linear constraints: parameter vector as described above, but with the vector $\phi$ replaced by vector $\psi$ that satisfies $\phi = R\psi$, where $\psi = (\psi_{1},...,\psi_{q})$.

Symbol $\phi$ denotes an AR coefficient, $\sigma^2$ a variance, $\alpha$ a mixing weight and $\nu$ a degrees of freedom parameter.
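
Because the result is a flat vector, it can be convenient to split it into named components. The helper below is a hypothetical sketch (split_gmar is not part of the package) for the non-restricted GMAR layout without constraints; it assumes that the mixing weights sum to one, so the last weight is recovered as one minus the sum of the others.

```r
# Hypothetical helper: split a non-restricted GMAR parameter vector.
split_gmar <- function(theta, p, M) {
  stopifnot(length(theta) == M * (p + 3) - 1)
  n_ups <- M * (p + 2)
  # One column per regime: (phi_{m,0}, phi_{m,1..p}, sigma_m^2).
  ups <- matrix(theta[seq_len(n_ups)], nrow = p + 2)
  # Remaining elements are alpha_1, ..., alpha_{M-1} (none when M = 1).
  alphas <- theta[-seq_len(n_ups)]
  alphas <- c(alphas, 1 - sum(alphas))  # assumption: weights sum to one
  list(phi0   = ups[1, ],
       phi    = ups[2:(p + 1), , drop = FALSE],
       sigma2 = ups[p + 2, ],
       alphas = alphas)
}

# Example with made-up values for p = 1, M = 2.
est <- c(0.5, 0.8, 1.0,   2.0, -0.3, 2.5,   0.6)
parts <- split_gmar(est, p = 1, M = 2)
```

In this example parts$phi0 holds the two intercepts, parts$phi is a 1 x 2 matrix of AR coefficients with one column per regime, and parts$alphas contains both mixing weights.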

Details

The user should consider adjusting ar0scale and/or sigmascale according to the best available knowledge about the process.

The genetic algorithm is mostly based on the description by Dorsey R. E. and Mayer W. J. (1995). It uses the individually adaptive crossover and mutation rates described by Patnaik L.M. and Srinivas M. (1994), with slight modifications.
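
To give a feel for what individually adaptive rates mean, the sketch below follows the general scheme of the 1994 paper as commonly stated: individuals with above-average fitness get crossover and mutation rates that shrink towards zero as their fitness approaches the population maximum, while below-average individuals get fixed, higher rates. The function name, the constants k1 to k4, and the fallback for a population with no fitness spread are assumptions made for illustration; this is not the package's modified implementation and the details should be checked against the paper.

```r
# Illustrative sketch of adaptive crossover and mutation rates.
# f_parent: the larger fitness of the two parents to be crossed.
# f_ind:    the fitness of the individual to be mutated.
adaptive_rates <- function(f_parent, f_ind, f_max, f_avg,
                           k1 = 1, k2 = 0.5, k3 = 1, k4 = 0.5) {
  spread <- f_max - f_avg
  if (spread <= 0) {
    # Assumed fallback: all individuals are equally fit.
    return(list(crossover = k3, mutation = k4))
  }
  pc <- if (f_parent >= f_avg) k1 * (f_max - f_parent) / spread else k3
  pm <- if (f_ind >= f_avg) k2 * (f_max - f_ind) / spread else k4
  list(crossover = pc, mutation = pm)
}

# Above-average individuals: rates scale with the distance to the best.
r_good <- adaptive_rates(f_parent = 8, f_ind = 7, f_max = 10, f_avg = 6)
# Below-average individuals: fixed rates k3 and k4.
r_poor <- adaptive_rates(f_parent = 5, f_ind = 4, f_max = 10, f_avg = 6)
```

The effect is that the best solutions found so far are protected from disruption while poor solutions are recombined and mutated aggressively.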

References

Kalliovirta L., Meitz M. and Saikkonen P. (2015) Gaussian Mixture Autoregressive model for univariate time series. Journal of Time Series Analysis, 36, 247-266.
Dorsey R. E. and Mayer W. J. (1995) Genetic algorithms for estimation problems with multiple optima, nondifferentiability, and other irregular features. Journal of Business & Economic Statistics, 13, 53-66.
Patnaik L.M. and Srinivas M. (1994) Adaptive Probabilities of Crossover and Mutation in Genetic Algorithms. Transactions on Systems, Man and Cybernetics, 24, 656-667.
References regarding the StMAR model and general linear constraints will be updated after they are published.