ARDL (version 0.1.0)

auto_ardl: Automatic ARDL model selection

Description

It searches for the best ARDL order specification, according to the selected criterion, taking into account the constraints provided.

Usage

auto_ardl(
  formula,
  data,
  max_order,
  fixed_order = -1,
  starting_order = NULL,
  selection = "AIC",
  selection_minmax = c("min", "max"),
  grid = FALSE,
  search_type = c("horizontal", "vertical"),
  start = NULL,
  end = NULL,
  ...
)

Arguments

formula

A "formula" describing the linear model. Details for model specification are given under 'Details' in the help file of the ardl function.

data

A time series object (e.g., "ts", "zoo" or "zooreg") or a data frame containing the variables in the model. In the case of a data frame, it is coerced into a ts object with start = 1, end = nrow(data) and frequency = 1. If not found in data, the variables are NOT taken from any environment.
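
The coercion described above can be reproduced in base R. This is a minimal sketch, assuming the coercion is equivalent to a plain ts() call with the stated arguments; the data frame df and its columns are invented for the illustration.

```r
# Hypothetical data frame; the column names are illustrative only.
df <- data.frame(y = c(1.2, 1.5, 1.1, 1.8), x = c(0.3, 0.4, 0.2, 0.5))

# Same arguments as the documented coercion:
# start = 1, end = nrow(data), frequency = 1.
df_ts <- ts(df, start = 1, end = nrow(df), frequency = 1)

start(df_ts)      # first time index
end(df_ts)        # last time index
frequency(df_ts)  # observations per unit of time
```

A data frame that really holds quarterly or monthly data loses that information under this coercion, so converting it to a "ts" or "zoo" object beforehand is probably the safer route.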

max_order

Sets the maximum order for each variable over which the search takes place. A numeric vector of the same length as the total number of variables (excluding the fixed ones, see 'Details' in the help file of the ardl function). It should only contain positive integers. A single integer may be provided if the maximum order is the same for all variables.

fixed_order

It allows setting a fixed order for some variables. The algorithm will not search for any other order than this. A numeric vector of the same length as the total number of variables (excluding the fixed ones). It should contain positive integers or 0 to set as a constraint. A -1 should be provided for any variable that should not be constrained. fixed_order overrides the corresponding max_order and starting_order.
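
One way to read the override rule is as an element-wise replacement: wherever fixed_order is not -1, its value takes the place of the corresponding entry of max_order or starting_order. The sketch below illustrates that reading; apply_fixed_order is a hypothetical helper written for this example and is not part of the ARDL package.

```r
# Hypothetical helper (not part of ARDL): replace the entries of `order`
# by the fixed values wherever fixed_order is not -1.
apply_fixed_order <- function(order, fixed_order = -1) {
  # Recycle a single value (such as the default -1) across all variables.
  fixed_order <- rep_len(fixed_order, length(order))
  ifelse(fixed_order >= 0, fixed_order, order)
}

fixed <- c(-1, -1, 2, -1)

# Effective upper bounds: the third variable is pinned to 2.
effective_max <- apply_fixed_order(c(5, 4, 4, 4), fixed)

# Effective starting order: the third variable is pinned to 2 as well.
effective_start <- apply_fixed_order(c(1, 1, 3, 2), fixed)
```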

starting_order

Specifies the order for each variable from which each search will start. It is a numeric vector of the same length as the total number of variables (excluding the fixed ones). It should contain positive integers or 0; a single integer may be provided if the starting order is the same for all variables. Default is NULL. If unspecified (NULL) and grid = FALSE, then all possible \(VAR(p)\) models are calculated (constraints are taken into account), where \(p\) is the minimum value in max_order. Note that where starting_order is provided, its first element will be the minimum value of \(p\) that the searching algorithm will consider (think of it as a 'minimum p order' restriction; see 'Searching algorithm' below). If grid = TRUE, only the first element (\(p\)) will have an effect.

selection

A character string specifying the selection criterion according to which the candidate models will be ranked. Default is "AIC". Any other selection criterion can be used (a user-defined function or a function from another package) as long as it can be applied as selection(model). The preferred model is the one with the smallest value of the selection criterion. If the selection criterion works the other way around (the bigger the better), selection_minmax = "max" should also be supplied (see 'Examples' below).
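
The requirement that a criterion "can be applied as selection(model)" can be checked on any fitted model before running a search. The sketch below defines a Hannan-Quinn criterion, HQ, as a user-defined example and applies it by name to a plain lm fit that stands in for an ARDL candidate. Looking the function up with get() is an assumption made for the illustration, not a description of how the package resolves the string.

```r
# Toy model fitted with base R; it stands in for an ARDL candidate model.
fit <- lm(dist ~ speed, data = cars)

# User-defined criterion: Hannan-Quinn information criterion (smaller is better).
HQ <- function(model) {
  ll <- logLik(model)
  k <- attr(ll, "df")  # number of estimated parameters
  n <- nobs(model)     # number of observations used in the fit
  -2 * as.numeric(ll) + 2 * k * log(log(n))
}

# A criterion supplied by name is looked up and then applied to the model.
selection <- "HQ"
criterion <- get(selection, mode = "function")
hq_value <- criterion(fit)
hq_value
```

Because smaller values of HQ are better, it would be used with the default selection_minmax = "min".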

selection_minmax

A character string that indicates whether the criterion in selection is supposed to be minimized (default) or maximized.

grid

If FALSE (default), the stepwise searching regression algorithm will search for the best model by adding and subtracting terms corresponding to different ARDL orders. If TRUE, the whole set of all possible ARDL models (accounting for constraints) will be evaluated. Note that this method can be very time-consuming when max_order is large and there are many independent variables, because the number of possible combinations grows very quickly.
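
To get a feel for how quickly the grid grows, the number of candidate models can be counted ahead of time. This is a rough sketch under the assumption that \(p\) ranges over 1 to max_order[1] and each remaining order over 0 to its maximum; count_grid is a hypothetical helper, not a function of the ARDL package.

```r
# Hypothetical helper (not part of ARDL): number of candidate orders in a full grid.
# Assumption: p takes values 1..max_order[1], every other order takes 0..max_order[j].
count_grid <- function(max_order, fixed_order = -1) {
  fixed_order <- rep_len(fixed_order, length(max_order))
  # Number of admissible values per variable when unconstrained.
  n_values <- c(max_order[1], max_order[-1] + 1)
  # A fixed variable contributes exactly one value.
  n_values[fixed_order >= 0] <- 1
  prod(n_values)
}

n_full  <- count_grid(c(5, 4, 4, 4))                     # 5 * 5 * 5 * 5
n_fixed <- count_grid(c(5, 4, 4, 4), c(-1, -1, 2, -1))   # 5 * 5 * 1 * 5
n_large <- count_grid(rep(8, 6))                         # 8 * 9^5
c(n_full, n_fixed, n_large)
```

Under these assumptions, fixing a single variable cuts the grid by a factor equal to the number of values that variable could otherwise take.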

search_type

A character string describing the search type. If "horizontal" (default), the searching algorithm increases or decreases by 1 the order of each variable in each iteration. When the order of the last variable has been accessed, it begins again from the first variable until it converges. If "vertical", the searching algorithm increases or decreases by 1 the order of a variable until it converges. Then it continues the same for the next variable. The two options result in very similar top orders. The default ("horizontal") is sometimes a little more accurate, but "vertical" is almost 2 times faster. Not applicable if grid = TRUE.
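
The difference between the two search types can be illustrated with a toy coordinate search. The sketch below is not the package's implementation: toy_criterion is an invented stand-in for AIC, the helper names are hypothetical, and for simplicity every order is allowed to range from 0 to its maximum.

```r
# Toy criterion standing in for AIC: minimized at order c(3, 1, 2).
toy_criterion <- function(order) sum((order - c(3, 1, 2))^2)

# Try moving variable j down or up by 1; return the better order (or the same one).
step_variable <- function(order, j, max_order, criterion) {
  best <- order
  for (delta in c(-1, 1)) {
    candidate <- order
    candidate[j] <- candidate[j] + delta
    in_range <- candidate[j] >= 0 && candidate[j] <= max_order[j]
    if (in_range && criterion(candidate) < criterion(best)) best <- candidate
  }
  best
}

# "horizontal": one step per variable, cycling through the variables until no change.
search_horizontal <- function(start, max_order, criterion) {
  order <- start
  repeat {
    previous <- order
    for (j in seq_along(order)) {
      order <- step_variable(order, j, max_order, criterion)
    }
    if (identical(order, previous)) return(order)
  }
}

# "vertical": exhaust the moves for one variable before turning to the next.
search_vertical <- function(start, max_order, criterion) {
  order <- start
  for (j in seq_along(order)) {
    repeat {
      updated <- step_variable(order, j, max_order, criterion)
      if (identical(updated, order)) break
      order <- updated
    }
  }
  order
}

res_h <- search_horizontal(c(1, 1, 1), c(5, 4, 4), toy_criterion)
res_v <- search_vertical(c(1, 1, 1), c(5, 4, 4), toy_criterion)
```

On this smooth toy criterion both variants reach the same order. The vertical variant visits each variable only once, which is one plausible reason it can be faster and occasionally less accurate on real criteria.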

start

Start of the time period which should be used for fitting the model.

end

End of the time period which should be used for fitting the model.

...

Additional arguments to be passed to the low level regression fitting functions.

Value

auto_ardl returns a list which contains:

best_model

An object of class c("dynlm", "lm", "ardl")

best_order

A numeric vector with the order of the best model selected

top_orders

A data.frame with the orders of the top 20 models

Searching algorithm

The algorithm runs the optimization from multiple starting points that differ in the autoregressive order \(p\), performing a complete search from each starting order. These starting orders are shown in the tables below, for grid = FALSE and different values of starting_order.

starting_order = NULL:

VAR(p)  ->  p  q1  q2  ...  qk
VAR(1)  ->  1  1   1   ...  1
VAR(2)  ->  2  2   2   ...  2
:       ->  :  :   :   :    :

starting_order = c(3, 0, 1, 2):

p  q1  q2  q3
3  0   1   2
4  0   1   2
:  :   :   :
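
For a user-supplied starting_order, the table of starting points can be generated mechanically. The sketch below assumes, in line with the starting orders listed in the examples of this page, that \(p\) runs from starting_order[1] up to max_order[1] while the remaining orders are held at their starting values. starting_points is a hypothetical helper and not part of the ARDL package.

```r
# Hypothetical helper (not part of ARDL): list the starting points implied by a
# user-supplied starting_order, with p running up to max_order[1].
starting_points <- function(starting_order, max_order) {
  stopifnot(length(starting_order) == length(max_order),
            starting_order[1] <= max_order[1])
  p_values <- seq(starting_order[1], max_order[1])
  # Repeat the starting values of q1, q2, ... once per value of p.
  q_part <- matrix(starting_order[-1], nrow = length(p_values),
                   ncol = length(starting_order) - 1, byrow = TRUE)
  out <- cbind(p_values, q_part)
  colnames(out) <- c("p", paste0("q", seq_len(ncol(q_part))))
  out
}

sp <- starting_points(c(3, 1, 3, 2), max_order = c(5, 4, 4, 4))
sp
```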

See Also

ardl

Examples

# NOT RUN {
data(denmark)

## Find the best ARDL order --------------------------------------------

# Up to 5 for the autoregressive order (p) and 4 for the rest (q1, q2, q3)

# Using the defaults search_type = "horizontal", grid = FALSE and selection = "AIC"
# ("Not run" indications only for testing purposes)
# }
# NOT RUN {
model1 <- auto_ardl(LRM ~ LRY + IBO + IDE, data = denmark,
                    max_order = c(5,4,4,4))
model1$top_orders

## Same, with search_type = "vertical" ---------------------------------

model1_h <- auto_ardl(LRM ~ LRY + IBO + IDE, data = denmark,
                      max_order = c(5,4,4,4), search_type = "vertical")
model1_h$top_orders

## Find the global optimum ARDL order ----------------------------------

# It may take more than 10 seconds
model_grid <- auto_ardl(LRM ~ LRY + IBO + IDE, data = denmark,
                        max_order = c(5,4,4,4), grid = TRUE)

## Different selection criteria ----------------------------------------

# Using BIC as selection criterion instead of AIC
model1_b <- auto_ardl(LRM ~ LRY + IBO + IDE, data = denmark,
                      max_order = c(5,4,4,4), selection = "BIC")
model1_b$top_orders

# Using other criteria like adjusted R squared (the bigger the better)
adjr2 <- function(x) { summary(x)$adj.r.squared }
model1_adjr2 <- auto_ardl(LRM ~ LRY + IBO + IDE, data = denmark,
                           max_order = c(5,4,4,4), selection = "adjr2",
                           selection_minmax = "max")
model1_adjr2$top_orders

# Using functions from other packages as selection criteria
if (requireNamespace("qpcR", quietly = TRUE)) {

library(qpcR)
model1_aicc <- auto_ardl(LRM ~ LRY + IBO + IDE, data = denmark,
                          max_order = c(5,4,4,4), selection = "AICc")
model1_aicc$top_orders
adjr2 <- function(x){ Rsq.ad(x) }
model1_adjr2 <- auto_ardl(LRM ~ LRY + IBO + IDE, data = denmark,
                           max_order = c(5,4,4,4), selection = "adjr2",
                           selection_minmax = "max")
model1_adjr2$top_orders

## Different starting order --------------------------------------------

# The searching algorithm will start from the following starting orders:
# p q1 q2 q3
# 1 1  3  2
# 2 1  3  2
# 3 1  3  2
# 4 1  3  2
# 5 1  3  2

model1_so <- auto_ardl(LRM ~ LRY + IBO + IDE, data = denmark,
                        max_order = c(5,4,4,4), starting_order = c(1,1,3,2))

# Starting from p=3 (don't search for p=1 and p=2)
# Starting orders:
# p q1 q2 q3
# 3 1  3  2
# 4 1  3  2
# 5 1  3  2

model1_so_3 <- auto_ardl(LRM ~ LRY + IBO + IDE, data = denmark,
                        max_order = c(5,4,4,4), starting_order = c(3,1,3,2))

# If starting_order = NULL, the starting orders for each iteration will be:
# p q1 q2 q3
# 1 1  1  1
# 2 2  2  2
# 3 3  3  3
# 4 4  4  4
# 5 5  5  5
}

## Add constraints -----------------------------------------------------

# Restrict only the order of IBO to be 2
model1_ibo2 <- auto_ardl(LRM ~ LRY + IBO + IDE, data = denmark,
                        max_order = c(5,4,4,4), fixed_order = c(-1,-1,2,-1))
model1_ibo2$top_orders

# Restrict the order of LRM to be 3 and the order of IBO to be 2
model1_lrm3_ibo2 <- auto_ardl(LRM ~ LRY + IBO + IDE, data = denmark,
                        max_order = c(5,4,4,4), fixed_order = c(3,-1,2,-1))
model1_lrm3_ibo2$top_orders

## Set the starting date for the regression (data starts at "1974 Q1") -

# Set regression starting date to "1976 Q1"
model1_76q1 <- auto_ardl(LRM ~ LRY + IBO + IDE, data = denmark,
                        max_order = c(5,4,4,4), start = "1976 Q1")
start(model1_76q1$best_model)
# }
