RemixAutoML (version 0.5.0)

AutoMLTS: AutoMLTS Is an Automated Machine Learning Time Series Forecasting Function

Description

AutoMLTS is an automated machine learning time series forecasting function. Use it to create hundreds of thousands of time series forecasts.

Usage

AutoMLTS(data, TargetColumnName = "Target",
  DateColumnName = "DateTime", GroupVariables = NULL,
  FC_Periods = 30, TimeUnit = "week", Lags = c(1:5),
  MA_Periods = c(1:5), CalendarVariables = FALSE,
  TimeTrendVariable = FALSE, DataTruncate = FALSE,
  SplitRatios = c(0.7, 0.2, 0.1), TaskType = "GPU",
  EvalMetric = "MAPE", GridTune = FALSE, GridEvalMetric = "mape",
  ModelCount = 1, ModelType = "catboost", NTrees = 1000,
  PartitionType = "timeseries", Timer = TRUE)

Arguments

data

Supply your full series data set here

TargetColumnName

List the column name of your target variable. E.g. "Target"

DateColumnName

List the column name of your date column. E.g. "DateTime"

GroupVariables

Defaults to NULL. Use NULL when you have a single series. Add in GroupVariables when you have a series for every level of a group or multiple groups.

FC_Periods

Set the number of periods you want to have forecasts for. E.g. 52 for weekly data to forecast a year ahead

TimeUnit

List the time unit your data is aggregated by. E.g. "hour", "day", "week", "year"
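As an illustration of how FC_Periods and TimeUnit relate, the sketch below builds the future dates that a forecast horizon would cover, using base R only. The last observed date and the variable names are hypothetical, and this is not necessarily how AutoMLTS generates its forecast dates internally. For an "hour" time unit a POSIXct date-time would be needed instead of a Date.

```r
# Hypothetical last observed date in a weekly series
last_date <- as.Date("2020-01-06")
fc_periods <- 4       # plays the role of FC_Periods
time_unit <- "week"   # plays the role of TimeUnit

# seq() returns the start date plus fc_periods future steps; drop the start date
future_dates <- seq(last_date, by = time_unit, length.out = fc_periods + 1)[-1]
print(future_dates)
```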

Lags

Select the periods for all lag variables you want to create. E.g. c(1:5, 52) for weekly data

MA_Periods

Select the periods for all moving average variables you want to create. E.g. c(1:5, 52) for weekly data
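To make the Lags and MA_Periods arguments concrete, here is a minimal base R sketch of lag and moving average features on a toy series. The helper names lag_k and ma_k are hypothetical, and the trailing window used for the moving average (current value plus the previous k - 1 values) is an assumption; the package may define its windows differently.

```r
y <- c(10, 12, 11, 15, 14, 18)

# Lag k: the value observed k periods earlier, NA where no history exists
# (assumes k is smaller than the series length)
lag_k <- function(x, k) c(rep(NA, k), head(x, -k))

# Trailing moving average over the current and previous (k - 1) values
ma_k <- function(x, k) as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))

features <- data.frame(
  y = y,
  lag_1 = lag_k(y, 1),
  lag_2 = lag_k(y, 2),
  ma_3 = ma_k(y, 3)
)
print(features)
```

The leading rows contain NA wherever the series has too little history to fill the lag or the window.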

CalendarVariables

Set to TRUE to have calendar variables created. The calendar variables are numeric representations of second, minute, hour, week day, month day, year day, week, isoweek, quarter, and year
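The sketch below shows what numeric calendar variables of this kind can look like, derived with base R from a few hypothetical dates. The column names are illustrative only and are not claimed to match the names AutoMLTS creates; only a subset of the variables listed above is shown.

```r
dates <- as.Date(c("2020-01-06", "2020-03-15", "2020-12-31"))
lt <- as.POSIXlt(dates)

calendar <- data.frame(
  date = dates,
  week_day = lt$wday,          # 0 = Sunday ... 6 = Saturday
  month_day = lt$mday,
  year_day = lt$yday + 1,      # POSIXlt counts days of the year from 0
  quarter = lt$mon %/% 3 + 1,  # POSIXlt counts months from 0
  year = lt$year + 1900        # POSIXlt stores years since 1900
)
print(calendar)
```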

TimeTrendVariable

Set to TRUE to have a time trend variable added to the model. The time trend is a numeric variable giving the position of each record in the time series (by group). It starts at 1 for the earliest point in time and increments by one for each successive time point.
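A time trend of this kind can be sketched in base R as a per-group counter over date-ordered rows. The data frame and the column names Store, Date, and TimeTrend are hypothetical; this illustrates the idea rather than the package's own implementation.

```r
df <- data.frame(
  Store = c("A", "A", "A", "B", "B"),
  Date = as.Date(c("2020-01-20", "2020-01-06", "2020-01-13",
                   "2020-01-06", "2020-01-13")),
  stringsAsFactors = FALSE
)

# Order by group and date so the counter follows time within each group
df <- df[order(df$Store, df$Date), ]

# Count 1, 2, 3, ... within each Store
df$TimeTrend <- ave(seq_len(nrow(df)), df$Store, FUN = seq_along)
print(df)
```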

DataTruncate

Set to TRUE to remove records with missing values from the lags and moving average features created
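The effect of truncation can be illustrated with base R: rows whose lag features are NA (because there is not enough history) are dropped. The toy data frame below is hypothetical, and complete.cases() is used here only as a stand-in for whatever the package does internally.

```r
features <- data.frame(
  y = c(10, 12, 11, 15),
  lag_1 = c(NA, 10, 12, 11),
  lag_2 = c(NA, NA, 10, 12)
)

# Keep only rows where every feature is present
truncated <- features[complete.cases(features), ]
print(truncated)
```

The trade-off is that truncation shortens each series by the longest lag or window, so long lags such as 52 remove a full year of weekly history.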

SplitRatios

E.g. c(0.7, 0.2, 0.1) for train, validation, and test sets
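The Examples section uses SplitRatios = c(1-2*30/143,30/143,30/143). The sketch below works through that arithmetic, assuming 143 is the number of time periods in the series and 30 is the desired size of each holdout set (both readings are assumptions, as are the variable names).

```r
n_periods <- 143   # assumed number of time periods in the series
holdout <- 30      # assumed number of periods for validation and for test

split_ratios <- c(1 - 2 * holdout / n_periods,
                  holdout / n_periods,
                  holdout / n_periods)

# The three ratios should sum to 1
stopifnot(isTRUE(all.equal(sum(split_ratios), 1)))

# Periods implied for train, validation, and test
period_counts <- round(split_ratios * n_periods)
print(period_counts)
```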

TaskType

Default is "GPU" but you can also set it to "CPU"

EvalMetric

Select from "RMSE", "MAE", "MAPE", "Poisson", "Quantile", "LogLinQuantile", "Lq", "NumErrors", "SMAPE", "R2", "MSLE", "MedianAbsoluteError"
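For orientation, three of the listed metrics can be computed by hand in base R as shown below. These are the textbook definitions applied to hypothetical values; the exact definitions and scaling used by catboost may differ.

```r
actual <- c(100, 200, 400)
predicted <- c(110, 180, 400)

mae <- mean(abs(actual - predicted))               # mean absolute error
rmse <- sqrt(mean((actual - predicted)^2))         # root mean squared error
mape <- mean(abs((actual - predicted) / actual))   # mean absolute percentage error, as a fraction

print(c(MAE = mae, RMSE = rmse, MAPE = mape))
```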

GridTune

Set to TRUE to run a grid tune

GridEvalMetric

This is the metric used to find the threshold. Select from 'poisson', 'mae', 'mape', 'mse', 'msle', 'kl', 'cs', 'r2'

ModelCount

Set the number of models to try in the grid tune

ModelType

Select from the list: "catboost"

NTrees

Select the number of trees you want to have built to train the model

PartitionType

Select "random" for random data partitioning or "time" for partitioning by time frames. Note that the default shown under Usage is "timeseries".
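The difference between the two partitioning styles can be sketched in base R as follows. With time-based partitioning the earliest rows go to training and the latest to testing; with random partitioning the same set sizes are assigned without regard to order. The variable names are hypothetical, and how the package actually assigns rows is an assumption here.

```r
set.seed(1)
n <- 10                       # rows, assumed already ordered by date
ratios <- c(0.7, 0.2, 0.1)    # same form as SplitRatios
labels <- c("train", "validate", "test")

# Cumulative cut points: rows 1-7 train, 8-9 validate, 10 test
cuts <- round(cumsum(ratios) * n)
time_partition <- rep(labels, times = diff(c(0, cuts)))

# Random partitioning keeps the set sizes but shuffles the assignment
random_partition <- sample(time_partition)

print(time_partition)
print(table(random_partition))
```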

Timer

Defaults to TRUE.

Value

Returns a data.table of original series and forecasts, the catboost model objects (everything returned from AutoCatBoostRegression()), and a time series forecast plot. The time series forecast plot will plot your single series or aggregate your data to a single series and create a plot from that.

See Also

Other Supervised Learning: AutoCatBoostClassifier, AutoCatBoostMultiClass, AutoCatBoostRegression, AutoCatBoostScoring, AutoH2OMLScoring, AutoH2OModeler, AutoH2OScoring, AutoH2oDRFClassifier, AutoH2oDRFMultiClass, AutoH2oDRFRegression, AutoH2oGBMClassifier, AutoH2oGBMMultiClass, AutoH2oGBMRegression, AutoNLS, AutoRecommenderScoring, AutoRecommender, AutoTS, AutoXGBoostClassifier, AutoXGBoostMultiClass, AutoXGBoostRegression, AutoXGBoostScoring

Examples

# NOT RUN {
Results <- AutoMLTS(data,
                    TargetColumnName = "Weekly_Sales",
                    DateColumnName = "Date",
                    GroupVariables = c("Store","Dept"),
                    FC_Periods = 52,
                    TimeUnit = "week",
                    Lags = c(1:5,52),
                    MA_Periods = c(1:5,52),
                    CalendarVariables = TRUE,
                    TimeTrendVariable = TRUE,
                    DataTruncate = FALSE,
                    SplitRatios = c(1-2*30/143,30/143,30/143),
                    TaskType = "GPU",
                    EvalMetric = "MAE",
                    GridTune = FALSE,
                    GridEvalMetric = "mae",
                    ModelCount = 1,
                    ModelType = "catboost",
                    NTrees = 1000,
                    PartitionType = "time")
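# Time series forecast plot
Results$TimeSeriesPlot
# data.table of the original series and forecasts
Results$Forecast
# catboost model objects returned from AutoCatBoostRegression()
Results$ModelInformation$...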
# }