
AutoH2oDRFCARMA Automated H2O Distributed Random Forest Calendar, Holiday, ARMA, and Trend Variables Forecasting. Create hundreds of thousands of time series forecasts using this function.
AutoH2oDRFCARMA(data, TargetColumnName = "Target",
DateColumnName = "DateTime", GroupVariables = NULL,
FC_Periods = 30, TimeUnit = "week", TargetTransformation = FALSE,
Lags = c(1:5), MA_Periods = c(1:5), CalendarVariables = FALSE,
HolidayVariable = TRUE, TimeTrendVariable = FALSE,
DataTruncate = FALSE, ZeroPadSeries = NULL, SplitRatios = c(0.7,
0.2, 0.1), EvalMetric = "MAE", GridTune = FALSE, ModelCount = 1,
NTrees = 1000, PartitionType = "timeseries", MaxMem = "32G",
NThreads = max(1, parallel::detectCores() - 2), Timer = TRUE)
data: Supply your full series data set here.
TargetColumnName: The name of your target variable column. E.g. "Target"
DateColumnName: The name of your date column. E.g. "DateTime"
GroupVariables: Defaults to NULL. Use NULL when you have a single series. Supply GroupVariables when you have a series for every level of a group or of multiple groups.
FC_Periods: Set the number of periods you want forecasts for. E.g. 52 for weekly data to forecast a year ahead.
TimeUnit: The time unit your data is aggregated by. E.g. "1min", "5min", "10min", "15min", "30min", "hour", "day", "week", "year"
TargetTransformation: Set to TRUE to run AutoTransformationCreate() to find the best transformation for the target variable. Tests YeoJohnson, BoxCox, and Asinh (also Asin and Logit for proportion target variables).
Lags: Select the periods for all lag variables you want to create. E.g. c(1:5,52)
MA_Periods: Select the periods for all moving average variables you want to create. E.g. c(1:5,52)
CalendarVariables: Set to TRUE to have calendar variables created. The calendar variables are numeric representations of second, minute, hour, week day, month day, year day, week, isoweek, quarter, and year.
HolidayVariable: Set to TRUE to have a holiday counter variable created.
TimeTrendVariable: Set to TRUE to have a time trend variable added to the model. The time trend is a numeric variable giving the position of each record in the time series (by group). It starts at 1 for the earliest point in time and increments by one for each successive time point.
DataTruncate: Set to TRUE to remove records with missing values created by the lag and moving average features.
ZeroPadSeries: Set to "all", "inner", or NULL. See TimeSeriesFill for an explanation.
SplitRatios: E.g. c(0.7,0.2,0.1) for train, validation, and test sets.
EvalMetric: Select from "RMSE", "MAE", "MAPE", "R2", "RMSLE"
GridTune: Set to TRUE to run a grid tune.
ModelCount: Set the number of models to try in the grid tune.
NTrees: Select the number of trees to build when training the model.
PartitionType: Select "random" for random data partitioning or "time" for partitioning by time frames. The default shown in the usage above is "timeseries".
MaxMem: Set to the maximum amount of memory you want to allow for running this function. Default is "32G".
NThreads: Set to the number of threads you want to dedicate to this function.
Timer: Set to FALSE to turn off the progress print statements.
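The Lags, MA_Periods, TimeTrendVariable, and DataTruncate arguments describe features that are easier to picture on a small example. The sketch below uses base R only and a made-up data frame (the column names Store, Date, and Target and the helpers lag_k and ma_n are illustrative assumptions, not part of the package). It is not the function's internal implementation, and the package's exact moving average windowing may differ.

```r
# Toy weekly data for two hypothetical groups, already sorted by date within group
toy <- data.frame(
  Store  = rep(c("A", "B"), each = 6),
  Date   = rep(seq(as.Date("2020-01-06"), by = "week", length.out = 6), times = 2),
  Target = c(10, 12, 11, 15, 14, 18, 5, 7, 6, 9, 8, 10)
)

# Shift a vector back by k periods, padding the start with NA
lag_k <- function(x, k) c(rep(NA, k), head(x, -k))

# Trailing mean of the current and previous n - 1 values (assumed windowing)
ma_n <- function(x, n) as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))

# Build each feature within its group, in the spirit of Lags = 1 and MA_Periods = 2
toy$Lag1 <- ave(toy$Target, toy$Store, FUN = function(x) lag_k(x, 1))
toy$MA2  <- ave(toy$Target, toy$Store, FUN = function(x) ma_n(x, 2))

# Time trend: 1 for the earliest record in each group, then 2, 3, ...
toy$TimeTrend <- ave(seq_along(toy$Target), toy$Store, FUN = seq_along)

# Analogue of DataTruncate = TRUE: drop rows where a lag or moving average is NA
truncated <- toy[complete.cases(toy), ]

print(toy)
print(nrow(truncated))
```

Longer lags and moving average windows leave more NA rows at the start of every series, so truncating costs more history per group as the periods grow.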
Returns a data.table of the original series and forecasts, the H2O DRF model objects (everything returned from AutoH2oDRFRegression()), a time series forecast plot, and transformation info if you set TargetTransformation to TRUE. The forecast plot shows your single series, or aggregates grouped data to a single series and plots that.
Other Automated Time Series: AutoCatBoostCARMA, AutoH2oGBMCARMA, AutoTS, AutoXGBoostCARMA
# NOT RUN {
Results <- AutoH2oDRFCARMA(
  data,
  TargetColumnName = "Target",
  DateColumnName = "Date",
  GroupVariables = c("Store", "Dept"),
  FC_Periods = 52,
  TimeUnit = "week",
  TargetTransformation = FALSE,
  Lags = c(1:5, 52),
  MA_Periods = c(1:5, 52),
  CalendarVariables = TRUE,
  HolidayVariable = TRUE,
  TimeTrendVariable = TRUE,
  DataTruncate = FALSE,
  ZeroPadSeries = "all",
  SplitRatios = c(1 - 2 * 30 / 143, 30 / 143, 30 / 143),
  EvalMetric = "MAE",
  GridTune = FALSE,
  ModelCount = 1,
  NTrees = 1000,
  PartitionType = "timeseries",
  MaxMem = "32G",
  NThreads = max(1, parallel::detectCores() - 2),
  Timer = TRUE)
Results$TimeSeriesPlot
Results$Forecast
Results$ModelInformation$...
# }
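The SplitRatios expression in the example is easier to read once the arithmetic is spelled out. The sketch below assumes that 143 is the number of weekly periods in each series and that 30 periods are held out for each of the validation and test sets. Neither assumption is stated in the source, and the variable names are illustrative.

```r
# Assumption: each series has 143 weekly observations (the denominator in the example)
n_periods <- 143
# Assumption: 30 periods each are reserved for validation and for test
holdout <- 30

split_ratios <- c(1 - 2 * holdout / n_periods,  # train share
                  holdout / n_periods,          # validation share
                  holdout / n_periods)          # test share

# The three shares should sum to 1
stopifnot(isTRUE(all.equal(sum(split_ratios), 1)))

# Approximate number of periods that land in each partition
period_counts <- round(split_ratios * n_periods)
names(period_counts) <- c("train", "validation", "test")
print(period_counts)
```

Under these assumptions the split works out to 83 training periods and 30 each for validation and test. Writing the ratios as fractions of the series length keeps the holdout sizes explicit where rounded decimals would hide them.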