forecast: Train a univariate time series forecasting model and make forecasts

Description

This function trains a model on the historical values of a time series using an autoregressive approach: the targets are the historical values and the features are their lagged values. The trained model is then used to predict the future values of the series with a recursive strategy.
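The autoregressive setup and the recursive strategy described above can be sketched in base R. This is only an illustration, not the package's implementation: the lags, the horizon, the column names (Lag1, Lag2, ...) and the use of stats::lm() as the regression model are all assumptions made for the example.

```r
# Illustrative sketch only: base R, not the utsf implementation.
y <- as.numeric(AirPassengers)
lags <- 1:3   # assumed autoregressive lags
h <- 4        # assumed forecast horizon
p <- max(lags)

# embed() returns rows (y[t], y[t-1], ..., y[t-p]):
# column 1 is the target and column 1 + k holds lag k.
E <- embed(y, p + 1)
train <- as.data.frame(E[, c(1, 1 + lags)])
names(train) <- c("target", paste0("Lag", lags))

fit <- lm(target ~ ., data = train)

# Recursive strategy: forecast one step ahead, append the forecast to the
# series, and reuse it as a lagged value for the next step.
extended <- y
pred <- numeric(h)
for (i in seq_len(h)) {
  n <- length(extended)
  new_x <- as.data.frame(t(extended[n + 1 - lags]))
  names(new_x) <- paste0("Lag", lags)
  pred[i] <- predict(fit, newdata = new_x)
  extended <- c(extended, pred[i])
}
pred
```

From the second step on, each forecast depends on earlier forecasts, which is what makes the strategy recursive.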

Usage

forecast(
  timeS,
  h,
  lags = NULL,
  method = "knn",
  param = NULL,
  efa = NULL,
  tuneGrid = NULL,
  preProcess = NULL
)

Value

An S3 object of class utsf, basically a list with, at least, the following components:

ts: The time series being forecast.
features: A data frame with the features of the training set. The column names of the data frame indicate the autoregressive lags.
targets: A vector with the targets of the training set.
lags: An integer vector with the autoregressive lags.
model: The regression model used recursively to make the forecast.
pred: An object of class ts and length h with the forecast.
efa: This component is included if forecast accuracy is estimated. A vector with estimates of forecast accuracy according to different forecast accuracy measures.
tuneGrid: This component is included if the tuneGrid parameter has been used. A data frame in which each row contains estimates of forecast accuracy for a combination of tuning parameters.

Arguments

timeS

A time series of class ts or a numeric vector.

h

A positive integer. Number of values to be forecast into the future, i.e., forecast horizon.

lags

An integer vector, in increasing order, expressing the lags used as autoregressive variables. If the default value (NULL) is provided, a suitable vector is chosen.

method

A string indicating the method used for training and forecasting. Allowed values are:

"knn": k-nearest neighbors (the default)
"lm": linear regression
"rt": regression trees
"mt": model trees
"bagging": bagging
"rf": random forests

See Details for a brief explanation of the models. It is also possible to use your own regression model. In that case, a function explaining how to build the model must be provided; see the vignette for further details.

param

A list with parameters for the underlying function that builds the model. If the default value (NULL) is provided, the model is fitted with its default parameters. See details for the functions used to train the models.

efa

Indicates how to estimate the forecast accuracy of the model, using the last observations of the time series as test set. If the default value (NULL) is provided, no estimation is done. To specify the size of the test set, the evaluation() function must be used.
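As a rough illustration of estimating accuracy with the last observations as test set, the following base R sketch holds out the final values of a series, forecasts them from the remaining data, and computes a few common accuracy measures. The test set size, the lags, the linear model and the choice of measures are assumptions for the example; the package's own procedure may differ.

```r
# Illustrative sketch only: base R, not the utsf implementation.
y <- as.numeric(UKgas)
h <- 4        # assumed test set size (last h observations)
lags <- 1:4   # assumed autoregressive lags

n <- length(y)
train <- y[seq_len(n - h)]
test <- y[(n - h + 1):n]

# Fit an autoregressive linear model on the training part only
p <- max(lags)
E <- embed(train, p + 1)
d <- as.data.frame(E[, c(1, 1 + lags)])
names(d) <- c("target", paste0("Lag", lags))
fit <- lm(target ~ ., data = d)

# Recursive forecast of the h held-out values
extended <- train
for (i in seq_len(h)) {
  m <- length(extended)
  new_x <- as.data.frame(t(extended[m + 1 - lags]))
  names(new_x) <- paste0("Lag", lags)
  extended <- c(extended, unname(predict(fit, newdata = new_x)))
}
pred <- tail(extended, h)

# Accuracy measures on the held-out observations
errors <- test - pred
acc <- c(MAE = mean(abs(errors)),
         RMSE = sqrt(mean(errors^2)),
         MAPE = 100 * mean(abs(errors / test)))
acc
```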

tuneGrid

A data frame with possible tuning values. The columns are named the same as the tuning parameters. The estimation of forecast accuracy is done as explained for the efa parameter. Rolling or fixed origin evaluation is done according to the value of the efa parameter (fixed if NULL). The best combination of parameters is used to train the model with all the historical values of the time series.
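The idea of evaluating each row of a tuning grid on a held-out test set, and then refitting with the best combination on the whole series, can be sketched in base R as follows. The helper knn_forecast() is hypothetical and hand-rolled for the example (it is not the package's or FNN's code), and the use of RMSE as the selection criterion is also an assumption.

```r
# Illustrative sketch only: base R, not the utsf implementation.
y <- as.numeric(UKgas)
h <- 4        # assumed test set size and forecast horizon
lags <- 1:4   # assumed autoregressive lags
n <- length(y)
train <- y[seq_len(n - h)]
test <- y[(n - h + 1):n]

# Hypothetical helper: recursive k-nearest neighbors forecast
knn_forecast <- function(series, h, lags, k) {
  p <- max(lags)
  E <- embed(series, p + 1)
  X <- E[, 1 + lags, drop = FALSE]   # lagged values (features)
  targets <- E[, 1]
  for (i in seq_len(h)) {
    m <- length(series)
    query <- series[m + 1 - lags]
    # Euclidean distance from the query to every training example
    d <- sqrt(rowSums(sweep(X, 2, query)^2))
    nearest <- order(d)[seq_len(k)]
    # The forecast is the mean target of the k nearest examples
    series <- c(series, mean(targets[nearest]))
  }
  tail(series, h)
}

# One row per combination of tuning parameters, as in tuneGrid
grid <- expand.grid(k = 1:5)
grid$RMSE <- sapply(grid$k, function(k) {
  sqrt(mean((test - knn_forecast(train, h, lags, k))^2))
})
grid

# Refit with the best k using all the historical values
best_k <- grid$k[which.min(grid$RMSE)]
final <- knn_forecast(y, h, lags, best_k)
final
```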

preProcess

A list indicating the preprocessing or transformation to apply. Currently, the list must have length 1 (only one preprocessing step is supported). If NULL, the additive transformation is applied to the series. The element of the list is created with the trend() function.

Details

The functions used to build and train the model are:

KNN: In this case no model is built and the function FNN::knn.reg() is used to predict the future values of the time series.
Linear models: Function stats::lm() to build the model and the method stats::predict.lm() associated with the trained model to forecast the future values of the time series.
Regression trees: Function rpart::rpart() to build the model and the method rpart::predict.rpart() associated with the trained model to forecast the future values of the time series.
Model trees: Function Cubist::cubist() to build the model and the method Cubist::predict.cubist() associated with the trained model to forecast the future values of the time series.
Bagging: Function ipred::bagging() to build the model and the method ipred::predict.regbagg() associated with the trained model to forecast the future values of the time series.
Random forest: Function ranger::ranger() to build the model and the method ranger::predict.ranger() associated with the trained model to forecast the future values of the time series.

Examples

## Forecast time series using k-nearest neighbors
f <- forecast(AirPassengers, h = 12, method = "knn")
f$pred
library(ggplot2)
autoplot(f)

## Using k-nearest neighbors changing the default k value
forecast(AirPassengers, h = 12, method = "knn", param = list(k = 5))$pred

## Using your own regression model

# Function to build the regression model
my_knn_model <- function(X, y) {
  structure(list(X = X, y = y), class = "my_knn")
}
# Function to predict a new example
predict.my_knn <- function(object, new_value) {
  FNN::knn.reg(train = object$X, test = new_value, y = object$y)$pred
}
forecast(AirPassengers, h = 12, method = my_knn_model)$pred

## Estimating forecast accuracy of the model
f <- forecast(UKgas, h = 4, lags = 1:4, method = "rf", efa = evaluation("minimum"))
f$efa

## Estimating forecast accuracy of different tuning parameters
f <- forecast(UKgas, h = 4, lags = 1:4, method = "knn", tuneGrid = expand.grid(k = 1:5))
f$tuneGrid

## Forecasting a trending series
# Without any preprocessing or transformation
f <- forecast(airmiles, h = 4, method = "knn", preProcess = list(trend("none")))
autoplot(f)

# Applying the additive transformation (default)
f <- forecast(airmiles, h = 4, method = "knn")
autoplot(f)
