Fit an ARIMA model to a univariate time series. This function builds on
the ARIMA model fitting approach used in stats::arima() by fitting
model parameters via a random restart algorithm.
arima(
x,
order = c(0L, 0L, 0L),
seasonal = list(order = c(0L, 0L, 0L), period = NA),
xreg = NULL,
include.mean = TRUE,
transform.pars = TRUE,
fixed = NULL,
init = NULL,
method = c("CSS-ML", "ML", "CSS"),
n.cond,
SSinit = c("Rossignol2011", "Gardner1980"),
optim.method = "BFGS",
optim.control = list(),
kappa = 1e+06,
diffuseControl = TRUE,
max_iters = 100,
max_repeats = 10,
max_inv_root = 1,
min_inv_root_dist = 0,
eps_tol = 1e-04
)
A list of class c("Arima2", "Arima"). This list contains all of the
same elements as the output of stats::arima, along with some additional
elements. All elements of the output list are:
coef
A vector of AR, MA, and regression coefficients. These can be extracted by the stats::coef method.
sigma2
The MLE of the variance of the innovations.
var.coef
The estimated variance matrix of the coefficients coef, which can be extracted by the stats::vcov method.
mask
A vector containing boolean values, indicating which parameters of the model were estimated.
loglik
The maximized log-likelihood (of the differenced data).
aic
The AIC value corresponding to the log-likelihood.
arma
A compact form of the model specification, as a vector giving the number of AR, MA, seasonal AR and seasonal MA coefficients, plus the period and the number of non-seasonal and seasonal differences.
residuals
The fitted innovations.
call
The matched call.
series
The name of the series x.
code
The convergence value returned by stats::optim.
n.cond
The number of initial observations not used in the fitting.
nobs
The number of observations used for the fitting.
model
A list representing the Kalman Filter used in the fitting.
x
The input time series.
num_starts
Number of restarts performed before the convergence criteria were satisfied.
all_values
Numeric vector of length num_starts containing the log-likelihood of every parameter initialization.
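The elements listed above can be inspected on a fitted object roughly as follows. This is a minimal sketch that assumes the arima2 package (which provides this function and the miHuron_level data used in the example on this page) is installed; the guard simply skips the code otherwise.

```r
# Sketch: inspect a fitted object. Assumes the arima2 package is installed.
if (requireNamespace("arima2", quietly = TRUE)) {
  set.seed(12345)
  fit <- arima2::arima(arima2::miHuron_level$Average, order = c(2, 0, 1))

  print(coef(fit))        # AR, MA, and intercept estimates
  print(vcov(fit))        # estimated variance matrix of the coefficients
  print(fit$num_starts)   # number of restarts performed
  print(fit$all_values)   # log-likelihood reached from each initialization
}
```

Because the object also has class "Arima", the usual stats methods such as coef() and vcov() apply to it.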
x
A univariate time series.
order
A specification of the non-seasonal part of the ARIMA model: the three integer components \((p, d, q)\) are the AR order, the degree of differencing, and the MA order.
seasonal
A specification of the seasonal part of the ARIMA model, plus the period (which defaults to frequency(x)). This may be a list with components order and period, or just a numeric vector of length 3 which specifies the seasonal order. In the latter case the default period is used.
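The two ways of writing the seasonal specification can be compared directly. The sketch below uses stats::arima() on the built-in AirPassengers series as a stand-in, on the assumption that this function handles the seasonal argument the same way as the function it builds on.

```r
# Monthly series, so frequency(y) is 12.
y <- log(AirPassengers)

# Seasonal part given as a list with an explicit period.
fit_list <- stats::arima(y, order = c(0, 1, 1),
                         seasonal = list(order = c(0, 1, 1), period = 12))

# Seasonal part given as a length-3 vector; the period defaults to frequency(y).
fit_vec <- stats::arima(y, order = c(0, 1, 1), seasonal = c(0, 1, 1))

# Both calls describe the same model, so the estimates should agree.
all.equal(coef(fit_list), coef(fit_vec))
```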
xreg
Optionally, a vector or matrix of external regressors, which must have the same number of rows as x.
include.mean
Should the ARMA model include a mean/intercept term? The default is TRUE for undifferenced series, and it is ignored for ARIMA models with differencing.
transform.pars
Logical; if true, the AR parameters are transformed to ensure that they remain in the region of stationarity. Not used for method = "CSS". For method = "ML", it has been advantageous to set transform.pars = FALSE in some cases; see also fixed.
fixed
Optional numeric vector of the same length as the total number of coefficients to be estimated. It should be of the form
$$(\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q,
\Phi_1, \ldots, \Phi_P, \Theta_1, \ldots, \Theta_Q, \mu),
$$
where \(\phi_i\) are the AR coefficients,
\(\theta_i\) are the MA coefficients,
\(\Phi_i\) are the seasonal AR coefficients,
\(\Theta_i\) are the seasonal MA coefficients and
\(\mu\) is the intercept term. Note that the \(\mu\) entry is required if and only if include.mean is TRUE. In particular it should not be present if the model is an ARIMA model with differencing.
The entries of the fixed vector should consist of the values at which the user wishes to "fix" the corresponding coefficient, or NA if that coefficient should not be fixed, but estimated.
The argument transform.pars will be set to FALSE if any AR parameters are fixed. A warning will be given if transform.pars is set to (or left at its default) TRUE. It may be wise to set transform.pars = FALSE even when fixing MA parameters, especially at values that cause the model to be nearly non-invertible.
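As an illustration of the layout of fixed, the sketch below holds the AR coefficient of an ARMA(1, 1) model at 0.5 and estimates the rest. It uses stats::arima() on the built-in lh series as a stand-in, on the assumption that fixed behaves the same way here as in the function this one builds on.

```r
# ARMA(1, 1) with a mean: coefficients are ordered (ar1, ma1, intercept).
# ar1 is fixed at 0.5; NA marks coefficients that should be estimated.
# transform.pars = FALSE is set explicitly because an AR parameter is fixed.
fit <- stats::arima(lh, order = c(1, 0, 1),
                    fixed = c(0.5, NA, NA),
                    transform.pars = FALSE)

coef(fit)   # ar1 stays at the fixed value; ma1 and intercept are estimates
fit$mask    # FALSE for the fixed coefficient, TRUE for the estimated ones
```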
init
Optional numeric vector of initial parameter values. Missing values will be filled in, by zeroes except for regression coefficients. Values already specified in fixed will be ignored.
method
Fitting method: maximum likelihood or minimization of the conditional sum-of-squares. The default (unless there are missing values) is to use conditional sum-of-squares to find starting values, then maximum likelihood. Can be abbreviated.
n.cond
Only used if fitting by conditional sum-of-squares: the number of initial observations to ignore. It will be ignored if less than the maximum lag of an AR term.
SSinit
A string specifying the algorithm to compute the state-space initialization of the likelihood; see KalmanLike for details. Can be abbreviated.
optim.method
The value passed as the method argument to optim.
optim.control
List of control parameters for optim.
kappa
The prior variance (as a multiple of the innovations variance) for the past observations in a differenced model. Do not reduce this.
diffuseControl
Boolean indicator of whether or not initial observations will have their likelihood values ignored if they are controlled by the diffuse prior, i.e., have a Kalman gain of at least 1e4.
max_iters
Maximum number of random restarts for methods "CSS-ML" and "ML". If set to 1, the result of this algorithm is the same as that of stats::arima() if argument diffuseControl is also set to TRUE. max_iters is often not reached because the max_repeats condition is typically satisfied first.
max_repeats
Integer. If the last max_repeats random starts did not result in improved likelihoods, then stop the search. Each result of the optim function is only considered to improve the likelihood if it does so by more than eps_tol.
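The way max_iters, max_repeats, and eps_tol interact can be sketched with a toy random-restart loop. This is an illustration of the stopping rule as described here, not the package's actual implementation; random_restart(), toy_neg_loglik(), and the uniform starting-value draw are hypothetical names and choices made for this example.

```r
# Hypothetical helper: maximize a log-likelihood from random starting values,
# stopping after max_repeats consecutive restarts without improvement.
random_restart <- function(neg_loglik, draw_start, max_iters = 100,
                           max_repeats = 10, eps_tol = 1e-4) {
  best_value <- -Inf
  best_par <- NULL
  all_values <- numeric(0)
  repeats <- 0L
  for (i in seq_len(max_iters)) {
    opt <- stats::optim(draw_start(), neg_loglik, method = "BFGS")
    value <- -opt$value                      # log-likelihood reached this time
    all_values <- c(all_values, value)
    if (value > best_value + eps_tol) {      # counts as an improvement
      best_value <- value
      best_par <- opt$par
      repeats <- 0L
    } else {
      repeats <- repeats + 1L
    }
    if (repeats >= max_repeats) break        # early stop before max_iters
  }
  list(par = best_par, value = best_value,
       num_starts = length(all_values), all_values = all_values)
}

# Toy objective with two local minima (the better one lies near p = -1).
toy_neg_loglik <- function(p) (p[1]^2 - 1)^2 + 0.1 * p[1]

set.seed(1)
res <- random_restart(toy_neg_loglik, function() runif(1, -2, 2))
res$num_starts
res$par
```

In this sketch the first start always counts as an improvement, so at least max_repeats + 1 starts are performed before the loop can stop early.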
max_inv_root
Positive numeric value less than or equal to 1. This number represents the maximum size of the inverted MA or AR polynomial roots for a new parameter estimate to be considered an improvement over previous estimates. Concerns about numerical stability arise when the polynomial roots lie near the unit circle. The default value 1 means that the parameter values corresponding to the best log-likelihood will be returned, even if their roots are near the unit circle. Suitable values of this parameter are near 1.
min_inv_root_dist
Positive numeric value less than 1. This number represents the minimum distance between AR and MA polynomial roots for a new parameter estimate to be considered an improvement on previous estimates. This is intended to avoid the possibility of returning parameter estimates with nearly canceling roots. Appropriate choices are values near 0.
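The quantities that max_inv_root and min_inv_root_dist refer to can be computed by hand with base R. The sketch below assumes R's sign conventions (AR polynomial 1 - phi_1 z - ..., MA polynomial 1 + theta_1 z + ...) and assumes the distance is measured between inverse roots; the coefficient values are made up to show a nearly canceling pair.

```r
ar <- c(0.5)     # AR polynomial: 1 - 0.5 z
ma <- c(-0.45)   # MA polynomial: 1 - 0.45 z

# Inverse roots of each polynomial (polyroot takes coefficients in
# increasing order of power).
ar_inv <- 1 / polyroot(c(1, -ar))
ma_inv <- 1 / polyroot(c(1, ma))

# Largest inverse-root size; values close to 1 signal near non-stationarity
# or near non-invertibility.
max_inv_root_size <- max(Mod(c(ar_inv, ma_inv)))   # approximately 0.5

# Smallest distance between any AR and any MA inverse root; values close to 0
# signal nearly canceling roots.
min_dist <- min(Mod(outer(ar_inv, ma_inv, "-")))   # approximately 0.05
```

Here the AR and MA inverse roots sit only about 0.05 apart, so a setting such as min_inv_root_dist = 0.1 would, under these assumptions, treat this estimate as having nearly canceling roots.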
eps_tol
Tolerance for accepting a new solution to be better than a previous solution in terms of log-likelihood. The default corresponds to a one ten-thousandth unit increase in log-likelihood.
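# Example: fit an ARMA(2, 1) model using up to 100 random restarts
library(arima2)  # assumed package providing arima() and miHuron_level
set.seed(12345)  # restarts are random, so fix the seed for reproducibility
arima(miHuron_level$Average, order = c(2, 0, 1), max_iters = 100)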