Returns the best ARIMA model according to either the AIC, AICc or BIC value. The function conducts a search over possible models within the order constraints provided.
auto.arima(
  y,
  d = NA,
  D = NA,
  max.p = 5,
  max.q = 5,
  max.P = 2,
  max.Q = 2,
  max.order = 5,
  max.d = 2,
  max.D = 1,
  start.p = 2,
  start.q = 2,
  start.P = 1,
  start.Q = 1,
  stationary = FALSE,
  seasonal = TRUE,
  ic = c("aicc", "aic", "bic"),
  stepwise = TRUE,
  nmodels = 94,
  trace = FALSE,
  approximation = (length(x) > 150 | frequency(x) > 12),
  method = NULL,
  truncate = NULL,
  xreg = NULL,
  test = c("kpss", "adf", "pp"),
  test.args = list(),
  seasonal.test = c("seas", "ocsb", "hegy", "ch"),
  seasonal.test.args = list(),
  allowdrift = TRUE,
  allowmean = TRUE,
  lambda = NULL,
  biasadj = FALSE,
  parallel = FALSE,
  num.cores = 2,
  x = y,
  ...
)
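A minimal usage sketch, assuming the forecast package is installed. The built-in WWWusage series and the 10-step horizon are illustrative choices, not part of this documentation:

```r
library(forecast)

# WWWusage is a short non-seasonal series from R's datasets package;
# it is used here only as a convenient example.
fit <- auto.arima(WWWusage)

# Print the selected model, its coefficients and information criteria.
print(fit)

# Forecast 10 steps ahead from the selected model.
fc <- forecast(fit, h = 10)
print(fc)
```

All other arguments are left at their defaults here, so the search is stepwise and uses the AICc.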
y: A univariate time series.
d: Order of first-differencing. If missing, a value is chosen based on test.
D: Order of seasonal-differencing. If missing, a value is chosen based on seasonal.test.
max.p: Maximum value of p.
max.q: Maximum value of q.
max.P: Maximum value of P.
max.Q: Maximum value of Q.
max.order: Maximum value of p+q+P+Q if model selection is not stepwise.
max.d: Maximum number of non-seasonal differences.
max.D: Maximum number of seasonal differences.
start.p: Starting value of p in the stepwise procedure.
start.q: Starting value of q in the stepwise procedure.
start.P: Starting value of P in the stepwise procedure.
start.Q: Starting value of Q in the stepwise procedure.
stationary: If TRUE, restricts the search to stationary models.
seasonal: If FALSE, restricts the search to non-seasonal models.
ic: Information criterion to be used in model selection.
stepwise: If TRUE, stepwise selection is used (faster). Otherwise, the search covers all models. Non-stepwise selection can be very slow, especially for seasonal models.
nmodels: Maximum number of models considered in the stepwise search.
trace: If TRUE, the list of ARIMA models considered will be reported.
approximation: If TRUE, estimation is via conditional sums of squares and the information criteria used for model selection are approximated. The final model is still computed using maximum likelihood estimation. Approximation should be used for long time series or a high seasonal period to avoid excessive computation times.
method: Fitting method: maximum likelihood or minimize conditional sum-of-squares. The default (unless there are missing values) is to use conditional-sum-of-squares to find starting values, then maximum likelihood. Can be abbreviated.
truncate: An integer value indicating how many observations to use in model selection. The last truncate values of the series are used to select a model when truncate is not NULL and approximation=TRUE. All observations are used if either truncate=NULL or approximation=FALSE.
xreg: Optionally, a numerical vector or matrix of external regressors, which must have the same number of rows as y. (It should not be a data frame.)
test: Type of unit root test to use. See ndiffs for details.
test.args: Additional arguments to be passed to the unit root test.
seasonal.test: This determines which method is used to select the number of seasonal differences. The default method is to use a measure of seasonal strength computed from an STL decomposition. Other possibilities involve seasonal unit root tests.
seasonal.test.args: Additional arguments to be passed to the seasonal unit root test. See nsdiffs for details.
allowdrift: If TRUE, models with drift terms are considered.
allowmean: If TRUE, models with a non-zero mean are considered.
lambda: Box-Cox transformation parameter. If lambda="auto", then a transformation is automatically selected using BoxCox.lambda. The transformation is ignored if NULL. Otherwise, the data are transformed before the model is estimated.
biasadj: Use adjusted back-transformed mean for Box-Cox transformations. If transformed data is used to produce forecasts and fitted values, a regular back transformation will result in median forecasts. If biasadj is TRUE, an adjustment will be made to produce mean forecasts and fitted values.
parallel: If TRUE and stepwise = FALSE, then the specification search is done in parallel. This can give a significant speedup on multicore machines.
num.cores: Allows the user to specify the number of parallel processes to be used if parallel = TRUE and stepwise = FALSE. If NULL, then the number of logical cores is automatically detected and all available cores are used.
x: Deprecated. Included for backwards compatibility.
...: Additional arguments to be passed to arima.
Return value: same as for Arima.
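As an illustration of the xreg, lambda and biasadj arguments, the sketch below fits one model with an external regressor and one with an automatically selected Box-Cox transformation. The simulated data, the regressor name temp and the seed are assumptions made up for this example:

```r
library(forecast)

set.seed(123)  # reproducible simulated data (illustrative only)

# Simulate a series that depends linearly on one external regressor.
n <- 120
temp <- rnorm(n)                                   # hypothetical regressor
noise <- arima.sim(model = list(ar = 0.6), n = n)  # AR(1) errors
y <- ts(10 + 2 * temp + as.numeric(noise))

# xreg must be a numeric vector or matrix (not a data frame) with one row
# per observation of y.
xreg <- cbind(temp = temp)
fit_x <- auto.arima(y, xreg = xreg)

# Forecasting a model with regressors needs future regressor values;
# the horizon is taken from the number of rows supplied.
future_xreg <- cbind(temp = rnorm(5))
fc_x <- forecast(fit_x, xreg = future_xreg)

# Box-Cox example: select lambda automatically and request mean (rather
# than median) forecasts on the original scale.
fit_bc <- auto.arima(AirPassengers, lambda = "auto", biasadj = TRUE)
fc_bc <- forecast(fit_bc, h = 12)
```

The column names of the future regressor matrix are kept identical to those used in fitting, since the forecasting step matches regressors by name.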
The default arguments are designed for rapid estimation of models for many time series. If you are analysing just one time series, and can afford to take some more time, it is recommended that you set stepwise=FALSE and approximation=FALSE.
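A rough sketch of that trade-off, comparing the default search with an exhaustive one (stepwise = FALSE, approximation = FALSE). The WWWusage series is an illustrative choice; on long or seasonal series the second call can take considerably longer:

```r
library(forecast)

y <- WWWusage  # short non-seasonal series, used only as an example

# Default settings: stepwise search, tuned for speed.
fit_fast <- auto.arima(y)

# Slower search over all models within the order constraints,
# with exact (non-approximated) information criteria.
fit_full <- auto.arima(y, stepwise = FALSE, approximation = FALSE)

# Compare the AICc of the two selected models.
c(fast = fit_fast$aicc, full = fit_full$aicc)
```

The two searches may or may not select the same model; the comparison only shows how to check.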
Non-stepwise selection can be slow, especially for seasonal data. The stepwise algorithm outlined in Hyndman & Khandakar (2008) is used except that the default method for selecting seasonal differences is now based on an estimate of seasonal strength (Wang, Smith & Hyndman, 2006) rather than the Canova-Hansen test. There are also some other minor variations to the algorithm described in Hyndman and Khandakar (2008).
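To see how the choice of seasonal differencing method plays out on a given series, the number of seasonal differences can be computed directly with nsdiffs and the stepwise search can be traced. This is a sketch under the assumption that the monthly USAccDeaths series is a reasonable example; the "ocsb" test is used for comparison only because it is one of the listed seasonal.test options:

```r
library(forecast)

y <- USAccDeaths  # monthly series from R's datasets package (example only)

# Number of seasonal differences suggested by the default
# seasonal-strength measure and by the OCSB seasonal unit root test.
D_seas <- nsdiffs(y, test = "seas")
D_ocsb <- nsdiffs(y, test = "ocsb")
c(seas = D_seas, ocsb = D_ocsb)

# Run the stepwise search with the default seasonal test and print
# each model considered along the way.
fit <- auto.arima(y, seasonal.test = "seas", trace = TRUE)

# Orders (p, d, q, P, D, Q) and seasonal period of the selected model.
arimaorder(fit)
```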
Hyndman, RJ and Khandakar, Y (2008) "Automatic time series forecasting: The forecast package for R", Journal of Statistical Software, 27(3).
Wang, X, Smith, KA, Hyndman, RJ (2006) "Characteristic-based clustering for time series data", Data Mining and Knowledge Discovery, 13(3), 335-364.