RemixAutoML (version 0.11.0)

AutoH2oGBMCARMA: Automated H2O GBM Calendar, Holiday, ARMA, and Trend Variables Forecasting

Description

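AutoH2oGBMCARMA builds automated H2O GBM forecasts using calendar, holiday, ARMA (lag and moving average), and trend variables. It is designed to forecast many series at once: with GroupVariables, a single call can produce hundreds of thousands of time series forecasts.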

Usage

AutoH2oGBMCARMA(data, TargetColumnName = "Target",
  DateColumnName = "DateTime", GroupVariables = NULL,
  FC_Periods = 30, TimeUnit = "week", TargetTransformation = FALSE,
  Lags = c(1:5), MA_Periods = c(1:5), CalendarVariables = FALSE,
  HolidayVariable = TRUE, TimeTrendVariable = FALSE,
  DataTruncate = FALSE, ZeroPadSeries = NULL, SplitRatios = c(0.7,
  0.2, 0.1), EvalMetric = "MAE", GridTune = FALSE, ModelCount = 1,
  NTrees = 1000, PartitionType = "timeseries", MaxMem = "32G",
  NThreads = max(1, parallel::detectCores() - 2), Timer = TRUE)

Arguments

data

Supply your full series data set here

TargetColumnName

The name of the column that holds your target variable. E.g. "Target"

DateColumnName

List the column name of your date column. E.g. "DateTime"

GroupVariables

Defaults to NULL. Use NULL when you have a single series. Add in GroupVariables when you have a series for every level of a group or multiple groups.

FC_Periods

Set the number of periods you want to have forecasts for. E.g. 52 for weekly data to forecast a year ahead

TimeUnit

List the time unit your data is aggregated by. E.g. "1min", "5min", "10min", "15min", "30min", "hour", "day", "week", "year"
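
FC_Periods and TimeUnit together define the forecast horizon. The following is a minimal base R sketch of the dates such a horizon covers, assuming a weekly series; last_date and fc_periods are hypothetical names used only for illustration, and this is not the package's internal code.

```r
# Hypothetical last observed date of a weekly series
last_date <- as.Date("2020-01-05")

# Number of periods to forecast (plays the role of FC_Periods)
fc_periods <- 4

# Generate one extra point and drop the first so the horizon starts
# one week after the last observation
future_dates <- seq(from = last_date, by = "week",
                    length.out = fc_periods + 1)[-1]
print(future_dates)
```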

TargetTransformation

Set to TRUE to run AutoTransformationCreate() to find the best transformation for the target variable. Tests YeoJohnson, BoxCox, and Asinh (also Asin and Logit for proportion target variables).

Lags

Select the periods for all lag variables you want to create. E.g. c(1:5,52)

MA_Periods

Select the periods for all moving average variables you want to create. E.g. c(1:5,52)
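
To illustrate what lag and moving average features look like, here is a small base R sketch. The helpers lag_k and ma_k are hypothetical and are not part of RemixAutoML; in particular, whether the package's moving average window includes the current period is an assumption made here for simplicity.

```r
x <- c(10, 12, 14, 16, 18, 20)

# Lag k: shift the series down by k positions, padding the start with NA
lag_k <- function(x, k) c(rep(NA_real_, k), x)[seq_along(x)]

# Trailing moving average of the last k observations (current one included)
ma_k <- function(x, k) {
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))
}

features <- data.frame(
  Target = x,
  Lag1   = lag_k(x, 1),
  Lag2   = lag_k(x, 2),
  MA3    = ma_k(x, 3)
)
print(features)
```

The leading rows contain NA because no history exists yet for those periods. Rows like these are the ones DataTruncate = TRUE is described as removing.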

CalendarVariables

Set to TRUE to have calendar variables created. The calendar variables are numeric representations of second, minute, hour, week day, month day, year day, week, isoweek, quarter, and year
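
As a rough illustration of numeric calendar variables, the base R sketch below derives several of the listed fields from one timestamp. The encodings chosen here (for example, Sunday = 1 for the week day) are assumptions; the package may number them differently.

```r
# A single example timestamp (a Sunday), fixed to UTC for reproducibility
dt <- as.POSIXct("2020-03-15 13:45:30", tz = "UTC")
lt <- as.POSIXlt(dt, tz = "UTC")

calendar <- c(
  second  = lt$sec,
  minute  = lt$min,
  hour    = lt$hour,
  wday    = lt$wday + 1,                      # POSIXlt counts from 0 (Sunday)
  mday    = lt$mday,
  yday    = lt$yday + 1,                      # POSIXlt counts from 0
  isoweek = as.integer(format(dt, "%V")),     # ISO 8601 week number
  quarter = lt$mon %/% 3 + 1,                 # months are 0-based in POSIXlt
  year    = lt$year + 1900                    # years are stored since 1900
)
print(calendar)
```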

HolidayVariable

Set to TRUE to have a holiday counter variable created.

TimeTrendVariable

Set to TRUE to have a time trend variable added to the model. The time trend is a numeric variable giving the position of each record in the time series (by group). It starts at 1 for the earliest point in time and increments by one for each successive time point.
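
A minimal sketch of a per-group time trend in base R, assuming a hypothetical data frame with a single grouping column named Store. This only illustrates the definition above and is not the package's implementation.

```r
df <- data.frame(
  Store  = c("A", "A", "A", "B", "B"),
  Date   = as.Date(c("2020-01-19", "2020-01-05", "2020-01-12",
                     "2020-01-12", "2020-01-05")),
  Target = c(6, 5, 7, 3, 2)
)

# Sort by group and date so the counter follows time order
df <- df[order(df$Store, df$Date), ]

# Count 1, 2, 3, ... within each group
df$TimeTrend <- ave(seq_len(nrow(df)), df$Store, FUN = seq_along)
print(df)
```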

DataTruncate

Set to TRUE to remove records with missing values from the lags and moving average features created

ZeroPadSeries

Set to "all", "inner", or NULL. See TimeSeriesFill for explanation

SplitRatios

The proportions of data assigned to the train, validation, and test sets, in that order. E.g. c(0.7,0.2,0.1)

EvalMetric

Select from "RMSE", "MAE", "MAPE", "R2", "RMSLE"

GridTune

Set to TRUE to run a grid tune

ModelCount

Set the number of models to try in the grid tune

NTrees

Select the number of trees you want to have built to train the model

PartitionType

Select "random" for random data partitioning or "time" for partitioning by time frames. The default shown in Usage is "timeseries".
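
To show how SplitRatios interacts with a time-ordered partition, the sketch below splits 143 weekly dates into contiguous train, validation, and test blocks using ratios of the form c(1-2*30/143, 30/143, 30/143). The start date and the rounding rule are assumptions for illustration; the package's own partitioning logic may differ.

```r
# 143 weekly dates (hypothetical start date)
dates  <- seq(as.Date("2010-02-05"), by = "week", length.out = 143)
ratios <- c(1 - 2 * 30 / 143, 30 / 143, 30 / 143)

# Convert cumulative ratios to row positions, then label contiguous blocks
cuts   <- round(cumsum(ratios) * length(dates))
set_id <- cut(seq_along(dates), breaks = c(0, cuts),
              labels = c("train", "validate", "test"))

print(table(set_id))

# The last 30 weeks form the test set, the 30 before that the validation set
print(range(dates[set_id == "test"]))
```

Keeping the blocks contiguous in time means the model is validated and tested only on periods later than those it was trained on.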

MaxMem

Set to the maximum amount of memory you want to allow for running this function. Default is "32G".

NThreads

Set to the number of threads you want to dedicate to this function.

Timer

Set to FALSE to turn off the updating print statements for progress

Value

Returns a data.table of the original series and forecasts, the H2O GBM model objects (everything returned from AutoH2oGBMRegression()), a time series forecast plot, and transformation info if you set TargetTransformation to TRUE. The plot shows your single series, or, when you have multiple series, your data aggregated to a single series.

See Also

Other Automated Time Series: AutoCatBoostCARMA, AutoH2oDRFCARMA, AutoTS, AutoXGBoostCARMA

Examples

# NOT RUN {
Results <- AutoH2oGBMCARMA(data,
                           TargetColumnName = "Target",
                           DateColumnName = "Date",
                           GroupVariables = c("Store","Dept"),
                           FC_Periods = 52,
                           TimeUnit = "week",
                           TargetTransformation = FALSE,
                           Lags = c(1:5,52),
                           MA_Periods = c(1:5,52),
                           CalendarVariables = TRUE,
                           HolidayVariable = TRUE,
                           TimeTrendVariable = TRUE,
                           DataTruncate = FALSE,
                           ZeroPadSeries = "all",
                           SplitRatios = c(1-2*30/143,30/143,30/143),
                           EvalMetric = "MAE",
                           GridTune = FALSE,
                           ModelCount = 1,
                           NTrees = 1000,
                           PartitionType = "timeseries",
                           MaxMem = "32G",
                           NThreads = max(1, parallel::detectCores() - 2),
                           Timer = TRUE)
Results$TimeSeriesPlot
Results$Forecast
Results$ModelInformation$...
# }
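
The example above assumes an existing object named data with Store, Dept, Date, and Target columns. As a sketch only, a synthetic data set with that shape could be built in base R as follows; the values, the number of groups, and the start date are invented for illustration.

```r
set.seed(1)

# 143 weekly dates, matching the denominator used in SplitRatios above
weeks  <- seq(as.Date("2010-02-05"), by = "week", length.out = 143)
groups <- expand.grid(Store = 1:2, Dept = 1:3)

# One full weekly series per Store/Dept combination
data <- data.frame(
  Store = rep(groups$Store, each = length(weeks)),
  Dept  = rep(groups$Dept,  each = length(weeks)),
  Date  = rep(weeks, times = nrow(groups))
)

# Invented target: a yearly seasonal pattern plus noise
iso_week    <- as.integer(format(data$Date, "%V"))
data$Target <- 100 + 10 * sin(2 * pi * iso_week / 52) +
  rnorm(nrow(data), sd = 5)

# Assumption: the function may expect a data.table; if so, convert with
# data.table::as.data.table(data) before calling it.
str(data)
```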
