
RemixAutoML (version 0.11.0)

IntermittentDemandDataGenerator: IntermittentDemandDataGenerator for frequency and size data sets

Description

IntermittentDemandDataGenerator builds the modeling data sets for intermittent demand forecasting. It generates a count (frequency) data set and a size data set for various future window sizes.

Usage

IntermittentDemandDataGenerator(data, FC_Periods = 52,
  SaveData = FALSE, FilePath = NULL, TargetVariableName = "qty",
  DateVariableName = "date", GroupingVariables = "sku",
  MinTimeWindow = 1, MinTxnRecords = 2, Lags = 1:7,
  MovingAverages = seq(7, 28, 7), TimeTrendVariable = TRUE,
  TimeUnit = "day", CalendarVariables = c("wday", "mday", "yday",
  "week", "isoweek", "month", "quarter", "year"),
  HolidayGroups = "USPublicHolidays", SampleRate = 0.5,
  PrintSteps = TRUE)

Arguments

data

Your transaction-level data
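As a rough illustration of what transaction-level input might look like, the sketch below builds a small synthetic data set whose column names match the function defaults ("sku", "date", "qty"). The values, the number of rows, and the three SKUs are hypothetical and only meant to show the shape of the data: one row per demand instance.

```r
# Hypothetical toy data: one row per transaction, default column names
set.seed(42)
n <- 200
data <- data.frame(
  sku  = sample(c("A", "B", "C"), n, replace = TRUE),
  date = as.Date("2020-01-01") + sample(0:364, n, replace = TRUE),
  qty  = rpois(n, lambda = 3) + 1L,   # strictly positive demand sizes
  stringsAsFactors = FALSE
)

# Sort by group and date so each sku reads as a time series
data <- data[order(data$sku, data$date), ]
head(data)
```

Since the function returns data.table objects, it may be necessary to convert the input first, for example with data.table::as.data.table(data); that requirement is an assumption and not stated on this page.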

FC_Periods

The number of future periods to collect data on

SaveData

Set to TRUE to save the MetaData and final modeling data sets to file

FilePath

Set to the file path where you want the data sets saved

TargetVariableName

The name of your target variable that represents demand

DateVariableName

The date variable of the demand instances

GroupingVariables

The variable (or combination of variables) that uniquely defines the level of granularity of each individual series to forecast. E.g. "sku" or c("Store","Department"). A sku is typically unique on its own. Store and Department in combination define all unique departments, since the same department may be repeated across stores.
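The point about combinations can be checked on a small example. The data frame below is hypothetical; it shows that counting departments alone understates the number of series when the same department appears in several stores.

```r
# Hypothetical data: the same departments appear in more than one store
stores <- data.frame(
  Store      = c("S1", "S1", "S2", "S2", "S2"),
  Department = c("Grocery", "Toys", "Grocery", "Toys", "Grocery"),
  qty        = c(3, 1, 2, 5, 4),
  stringsAsFactors = FALSE
)

# Department alone does not identify a series
n_departments <- length(unique(stores$Department))

# The Store + Department combination does
n_levels <- nrow(unique(stores[, c("Store", "Department")]))

n_departments  # 2
n_levels       # 4
```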

MinTimeWindow

The number of time periods to omit from training. The default is 1 so that there is always at least one period of values to forecast. Set it to a larger value if you do not want more possible target windows for the lower target window values.

MinTxnRecords

The minimum number of transaction records required. I typically set this to 2 so that there is at least one other instance of demand and the forecasted values are not complete nonsense.

Lags

Select the periods for all lag variables you want to create. E.g. c(1:5,52)

MovingAverages

Select the periods for all moving average variables you want to create. E.g. c(1:5,52)
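To make the Lags and MovingAverages arguments concrete, here is a minimal base R sketch of a lag-1 variable and a 3-period trailing moving average computed within each group. It is not the package's implementation: the helper names lag_k and roll_mean are hypothetical, and whether the package averages the current value or lagged values is an assumption left open here.

```r
# Hypothetical demand series for two skus, already sorted by date
demand <- data.frame(
  sku = rep(c("A", "B"), each = 5),
  qty = c(1, 2, 3, 4, 5, 10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# Shift a vector back by k periods, padding the start with NA
lag_k <- function(v, k) c(rep(NA, k), v)[seq_along(v)]

# Trailing mean over the last k values (current value included)
roll_mean <- function(v, k) {
  out <- rep(NA_real_, length(v))
  if (length(v) >= k) {
    cs  <- cumsum(c(0, v))
    idx <- k:length(v)
    out[idx] <- (cs[idx + 1] - cs[idx - k + 1]) / k
  }
  out
}

# ave() applies each helper separately within every sku
demand$qty_lag1 <- ave(demand$qty, demand$sku, FUN = function(v) lag_k(v, 1))
demand$qty_ma3  <- ave(demand$qty, demand$sku, FUN = function(v) roll_mean(v, 3))
demand
```

Note that the lag and the moving average restart at each sku, so the first values of every group are NA rather than borrowed from the previous group.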

TimeTrendVariable

Set to TRUE to have a time trend variable added to the model. Time trend is a numeric variable indicating the position of each record in the time series (by group). Time trend starts at 1 for the earliest point in time and increments by one for each successive time point.
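A time trend of this kind can be sketched in base R as a per-group running index over date-sorted records. The data and the column name TimeTrend below are hypothetical, and the package may construct the variable differently.

```r
# Hypothetical records, deliberately out of date order
ts_data <- data.frame(
  sku  = c("A", "A", "A", "B", "B"),
  date = as.Date(c("2020-01-03", "2020-01-01", "2020-01-02",
                   "2020-01-05", "2020-01-04")),
  stringsAsFactors = FALSE
)

# Sort within group so the earliest date comes first
ts_data <- ts_data[order(ts_data$sku, ts_data$date), ]

# Running index that restarts at 1 for each sku
ts_data$TimeTrend <- ave(seq_along(ts_data$sku), ts_data$sku, FUN = seq_along)
ts_data
```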

TimeUnit

List the time unit your data is aggregated by. E.g. "1min", "5min", "10min", "15min", "30min", "hour", "day", "week", "month", "quarter", "year"

CalendarVariables

A character vector naming the calendar variables to create, e.g. c("wday", "mday", "yday", "week", "isoweek", "month", "quarter", "year"). The calendar variables are numeric representations of second, minute, hour, week day, month day, year day, week, isoweek, month, quarter, and year
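For readers unfamiliar with these names, the following base R sketch derives several of the calendar variables from a date. The numbering conventions used here (week day 1 = Sunday, year day starting at 1) are an assumption modeled on the data.table helpers of the same names; the package's own output may differ.

```r
d  <- as.Date(c("2020-01-01", "2020-07-04"))
lt <- as.POSIXlt(d)

cal <- data.frame(
  date    = d,
  wday    = lt$wday + 1L,                  # 1 = Sunday ... 7 = Saturday
  mday    = lt$mday,                       # day of month
  yday    = lt$yday + 1L,                  # day of year, starting at 1
  isoweek = as.integer(format(d, "%V")),   # ISO 8601 week number
  month   = lt$mon + 1L,
  quarter = lt$mon %/% 3L + 1L,
  year    = lt$year + 1900L
)
cal
```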

HolidayGroups

Input the holiday groups of your choice from the CreateHolidayVariables() function in this package

SampleRate

Set this to a value greater than 0. The calculation used is the number of records per group level raised to the power of SampleRate.
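The effect of the exponent is easiest to see numerically. The sketch below applies the stated calculation to three hypothetical group sizes; how the package rounds the result or uses it when sampling is not described here, so only the raw values are shown.

```r
# Hypothetical number of records for three group levels
records_per_group <- c(A = 4, B = 100, C = 10000)
SampleRate <- 0.5

# Number of records per group level raised to the power of SampleRate
sample_size <- records_per_group ^ SampleRate
sample_size  # A = 2, B = 10, C = 100
```

With SampleRate below 1 the result grows more slowly than the group size, so large groups contribute proportionally fewer rows; at SampleRate = 1 the value equals the number of records.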

PrintSteps

Set to TRUE to have operation steps printed to the console

Value

Returns two data.table data sets: the first is the modeling data set for the count distribution, and the second is the modeling data set for the size model.

See Also

Other Feature Engineering: AutoDataPartition, AutoTransformationCreate, AutoTransformationScore, AutoWord2VecModeler, CreateCalendarVariables, CreateHolidayVariables, DT_GDL_Feature_Engineering, DummifyDT, GDL_Feature_Engineering, ModelDataPrep, Partial_DT_GDL_Feature_Engineering, Scoring_GDL_Feature_Engineering, TimeSeriesFill

Examples

# NOT RUN {
DataSets <- IntermittentDemandDataGenerator(data,
                                            FC_Periods = 52,
                                            SaveData = FALSE,
                                            FilePath = NULL,
                                            TargetVariableName = "qty",
                                            DateVariableName = "date",
                                            GroupingVariables = "sku",
                                            MinTimeWindow = 1,
                                            MinTxnRecords = 2,
                                            Lags = 1:7,
                                            MovingAverages = seq(7,28,7),
                                            TimeTrendVariable = TRUE,
                                            TimeUnit = "day",
                                            CalendarVariables = c("wday",
                                                                  "mday",
                                                                  "yday",
                                                                  "week",
                                                                  "isoweek",
                                                                  "month",
                                                                  "quarter",
                                                                  "year"),
                                            HolidayGroups = "USPublicHolidays",
                                            SampleRate = 0.50)
CountModelData <- DataSets$CountModelData
SizeModelData <- DataSets$SizeModelData
rm(DataSets)
# }
