forecastML (version 0.5.0)

return_error: Compute forecast error

Description

Compute forecast error metrics on the validation datasets or a test dataset.

Usage

return_error(data_results, data_test = NULL, test_indices = NULL,
  metrics = c("mae", "mape", "mdape", "smape"), models = NULL,
  horizons = NULL, windows = NULL, group_filter = NULL)

Arguments

data_results

An object of class 'training_results' or 'forecast_results' returned by calling predict() on a trained model.

data_test

Required only when data_results is an object of class 'forecast_results'. A data.frame used to assess the accuracy of the forecasts. data_test should contain the outcome/target columns and any grouping columns.

test_indices

Required if data_test is given. A vector or 1-column data.frame of numeric row indices or dates (class 'Date') with length nrow(data_test).

metrics

Common forecast error metrics. See the Error Metrics section below for details. The default behavior is to return all metrics.

models

Optional. A character vector of user-defined model names supplied to train_model().

horizons

Optional. A numeric vector to filter results by horizon.

windows

Optional. A numeric vector to filter results by validation window number.

group_filter

Optional. A string for filtering results for grouped time series (e.g., "group_col_1 == 'A'"). The string is passed to dplyr::filter() internally.

Value

An S3 object of class 'validation_error' or 'forecast_error': A list of data.frames of error metrics for the validation datasets or forecast dataset depending on the data_test argument.

A list containing:

  • Error metrics by horizon + validation window

  • Error metrics by horizon, collapsed across validation windows

  • Global error metrics collapsed across horizons and validation windows
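The three levels of aggregation listed above can be illustrated with a minimal base R sketch. The data.frame, its column names, and its values are hypothetical and are not the structure returned by return_error(). The sketch also assumes that collapsing means pooling the individual predictions, which may differ from how the package aggregates internally.

```r
# Hypothetical validation results: one row per prediction (toy values).
results <- data.frame(
  horizon   = c(1, 1, 1, 1, 12, 12, 12, 12),
  window    = c(1, 1, 2, 2, 1, 1, 2, 2),
  actual    = c(100, 110, 120, 130, 100, 110, 120, 130),
  predicted = c(98, 113, 125, 130, 90, 120, 105, 140)
)
results$abs_error <- abs(results$actual - results$predicted)

# (1) MAE by horizon + validation window.
mae_by_window <- aggregate(abs_error ~ horizon + window, data = results, FUN = mean)

# (2) MAE by horizon, collapsed across validation windows.
mae_by_horizon <- aggregate(abs_error ~ horizon, data = results, FUN = mean)

# (3) Global MAE, collapsed across horizons and validation windows.
mae_global <- mean(results$abs_error)
```

Each step down the list trades detail for a more stable summary: the first table shows how error varies over time, the second compares short and long horizons, and the third gives a single number per model.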

Error Metrics

  • mae: Mean absolute error

  • mape: Mean absolute percentage error

  • mdape: Median absolute percentage error

  • smape: Symmetric mean absolute percentage error
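As a rough illustration of the four metrics, the base R sketch below computes each one on hypothetical toy vectors. The percentage scaling (multiplying by 100) and the sMAPE formula shown here are common textbook conventions assumed for the example; the package's internal definitions may differ in detail.

```r
# Hypothetical toy values; not taken from the package.
actual    <- c(100, 200, 400)
predicted <- c(110, 180, 400)

abs_error     <- abs(actual - predicted)        # absolute errors
abs_pct_error <- abs_error / abs(actual) * 100  # absolute percentage errors

mae   <- mean(abs_error)        # mean absolute error
mape  <- mean(abs_pct_error)    # mean absolute percentage error
mdape <- median(abs_pct_error)  # median absolute percentage error

# One common sMAPE variant (assumed here): scale each absolute error by the
# average magnitude of the actual and predicted values.
smape <- mean(2 * abs_error / (abs(actual) + abs(predicted))) * 100
```

Note that mape and mdape are undefined when an actual value is 0, which is one reason sMAPE and MAE are often reported alongside them.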

Methods and related functions

The output of return_error() has the following generic S3 methods:

  • plot from return_error()

Examples

# NOT RUN {
# Sampled Seatbelts data from the R package datasets.
data("data_seatbelts", package = "forecastML")

# Example - Training data for 2 horizon-specific models w/ common lags per predictor.
horizons <- c(1, 12)
lookback <- 1:15

data_train <- create_lagged_df(data_seatbelts, type = "train", outcome_col = 1,
                               lookback = lookback, horizon = horizons)

windows <- create_windows(data_train, window_length = 12)

# User-defined model - LASSO
# A user-defined wrapper function for model training that takes the following
# arguments: (1) a horizon-specific data.frame made with create_lagged_df(..., type = "train")
# (e.g., my_lagged_df$horizon_h) and, optionally, (2) any number of additional named arguments
# which are passed as '...' in train_model().
library(glmnet)
model_function <- function(data, my_outcome_col) {

  x <- data[, -(my_outcome_col), drop = FALSE]
  y <- data[, my_outcome_col, drop = FALSE]
  x <- as.matrix(x)
  y <- as.matrix(y)

  model <- glmnet::cv.glmnet(x, y, nfolds = 3)
  return(model)
}

# my_outcome_col = 1 is passed in ... but could have been defined in model_function().
model_results <- train_model(data_train, windows, model_name = "LASSO", model_function,
                             my_outcome_col = 1)

# User-defined prediction function - LASSO
# The predict() wrapper takes two positional arguments. First,
# the returned model from the user-defined modeling function (model_function() above).
# Second, a data.frame of predictors--identical to the datasets returned from
# create_lagged_df(..., type = "train"). The function can return a 1- or 3-column data.frame
# with either (a) point forecasts or (b) point forecasts plus lower and upper forecast
# bounds (column order and column names do not matter).
prediction_function <- function(model, data_features) {

  x <- as.matrix(data_features)

  data_pred <- data.frame("y_pred" = predict(model, x, s = "lambda.min"))
  return(data_pred)
}

# Predict on the validation datasets.
data_valid <- predict(model_results, prediction_function = list(prediction_function),
                      data = data_train)

# Forecast error metrics for validation datasets.
data_error <- return_error(data_valid)
# }
