AutoCatBoostSizeFreqDist builds size and frequency distributions via quantile regressions. Size (or severity) and frequency (or count) quantile regressions are built. Use this with the AutoQuantileGibbsSampler function to simulate the joint distribution.
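For readers unfamiliar with quantile regression, the sketch below shows the standard pinball (quantile) loss in base R. It is only an illustration of the general technique: the function name pinball_loss and the toy numbers are made up here, and nothing in it is taken from the package's internals.

```r
# Pinball (quantile) loss for one quantile level tau in (0, 1).
# Illustrative helper only; not part of the package.
pinball_loss <- function(actual, predicted, tau) {
  err <- actual - predicted
  # Under-predictions are weighted by tau, over-predictions by (1 - tau).
  mean(pmax(tau * err, (tau - 1) * err))
}

actual <- c(0, 1, 2, 5, 10)   # toy demand values
constant_guess <- rep(2, 5)   # same prediction for every row

loss_q90 <- pinball_loss(actual, constant_guess, tau = 0.9)
loss_q10 <- pinball_loss(actual, constant_guess, tau = 0.1)

# A guess of 2 sits low for the 0.9 quantile, so it is penalized more there.
print(c(q90 = loss_q90, q10 = loss_q10))
```

Fitting one model per quantile level against this kind of loss is what yields a set of quantiles that together approximate a distribution.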
AutoCatBoostSizeFreqDist(
CountData = NULL,
SizeData = NULL,
CountQuantiles = seq(0.1, 0.9, 0.1),
SizeQuantiles = seq(0.1, 0.9, 0.1),
AutoTransform = TRUE,
DataPartitionRatios = c(0.75, 0.2, 0.05),
StratifyColumnNames = NULL,
NTrees = 1500,
TaskType = "GPU",
EvalMetric = "Quantile",
GridTune = FALSE,
GridEvalMetric = "mae",
CountTargetColumnName = NULL,
SizeTargetColumnName = NULL,
CountFeatureColNames = NULL,
SizeFeatureColNames = NULL,
CountIDcols = NULL,
SizeIDcols = NULL,
ModelIDs = c("CountModel", "SizeModel"),
MaxModelsGrid = 5,
ModelPath = NULL,
MetaDataPath = NULL,
NumOfParDepPlots = 0
)
CountData: The count data generated by the IntermittentDemandBootStrapper() function.
SizeData: The size data generated by the IntermittentDemandBootStrapper() function.
CountQuantiles: The default is deciles, i.e. seq(0.10, 0.90, 0.10). A finer grid gives a more detailed distribution, but it will take longer to run.
SizeQuantiles: The default is deciles, i.e. seq(0.10, 0.90, 0.10). A finer grid gives a more detailed distribution, but it will take longer to run.
AutoTransform: The default is TRUE. Set to FALSE to prevent your target variables from being automatically transformed for the best normalization.
DataPartitionRatios: The default is c(0.75, 0.20, 0.05). With CatBoost, you should allocate a decent amount to the validation data (second input). Three inputs are required.
StratifyColumnNames: Specify grouping variables to stratify by.
NTrees: The default is 1500. If the best model uses all trees, you should consider increasing this argument.
TaskType: The default is "GPU". If you do not have a GPU, set it to "CPU".
EvalMetric: Set to "Quantile". Alternative quantile methods may become available in the future.
GridTune: The default is FALSE. If you set it to TRUE, make sure to set MaxModelsGrid to a number greater than 1.
GridEvalMetric: The default is "mae". Choose from 'poisson', 'mae', 'mape', 'mse', 'msle', 'kl', 'cs', 'r2'.
CountTargetColumnName: Column names or column numbers.
SizeTargetColumnName: Column names or column numbers.
CountFeatureColNames: Column names or column numbers.
SizeFeatureColNames: Column names or column numbers.
CountIDcols: Column names or column numbers.
SizeIDcols: Column names or column numbers.
ModelIDs: A two-element character vector, e.g. c("CountModel", "SizeModel").
MaxModelsGrid: Set to a number greater than 1 if GridTune is set to TRUE.
ModelPath: The path where all your models will be stored. If you leave MetaDataPath NULL, the evaluation metadata will also be stored here. If you leave this NULL, the function will not run.
MetaDataPath: A separate path to store the model metadata for evaluation.
NumOfParDepPlots: Set to a number greater than or equal to 1 to see the relationships between your features and targets.
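As a small sketch of the quantile-grid and partition arguments, the base R snippet below compares the default deciles with a finer grid and sanity-checks a partition vector. The name FinerQuantiles is illustrative, and the check that the ratios sum to 1 is an assumption about sensible input, not something the documentation states.

```r
# Default grid: deciles, one quantile regression per level.
CountQuantiles <- seq(0.10, 0.90, 0.10)

# A finer, hypothetical grid: more levels, hence more models and a longer run.
FinerQuantiles <- seq(0.05, 0.95, 0.05)

cat("Default levels:", length(CountQuantiles), "\n")
cat("Finer levels:  ", length(FinerQuantiles), "\n")

# Train / validation / test shares; three values are required.
DataPartitionRatios <- c(0.75, 0.20, 0.05)

# Assumption: the three shares should add up to 1.
stopifnot(
  length(DataPartitionRatios) == 3,
  isTRUE(all.equal(sum(DataPartitionRatios), 1))
)
```

Since each quantile level corresponds to its own regression, the finer grid roughly doubles the number of models to fit for that target.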
This function does not return anything. It only stores your models and model evaluation metadata to file.
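Because all output goes to disk and the function will not run with ModelPath left NULL, it can help to prepare the directories first. The following is a minimal base R sketch; the folder names are hypothetical, and tempdir() is used only so the snippet is safe to run anywhere.

```r
# Hypothetical output locations; replace tempdir() with a real project folder.
ModelPath    <- file.path(tempdir(), "SizeFreqModels")
MetaDataPath <- file.path(ModelPath, "ModelMetaData")

# recursive = TRUE also creates ModelPath; showWarnings = FALSE keeps reruns quiet.
dir.create(MetaDataPath, recursive = TRUE, showWarnings = FALSE)

# Fail early if either directory is missing.
stopifnot(dir.exists(ModelPath), dir.exists(MetaDataPath))
```

These two values can then be passed as the ModelPath and MetaDataPath arguments.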
Other Supervised Learning - Compound: AutoCatBoostHurdleModel(), AutoH2oDRFHurdleModel(), AutoH2oGBMHurdleModel(), AutoH2oGBMSizeFreqDist(), AutoXGBoostHurdleModel()
# NOT RUN {
# CountData and SizeData are assumed to already exist, e.g. as generated by
# IntermittentDemandBootStrapper(). The feature columns are given as
# 2:ncol(...), so the first column of each table is not used as a feature.
AutoCatBoostSizeFreqDist(
  CountData = CountData,
  SizeData = SizeData,
  CountQuantiles = seq(0.10, 0.90, 0.10),
  SizeQuantiles = seq(0.10, 0.90, 0.10),
  AutoTransform = TRUE,
  DataPartitionRatios = c(0.75, 0.20, 0.05),
  StratifyColumnNames = NULL,
  NTrees = 1500,
  TaskType = "GPU",  # set to "CPU" if you do not have a GPU
  EvalMetric = "Quantile",
  GridTune = FALSE,
  GridEvalMetric = "mae",
  CountTargetColumnName = "Counts",
  SizeTargetColumnName = "Target_qty",
  CountFeatureColNames = 2:ncol(CountData),
  SizeFeatureColNames = 2:ncol(SizeData),
  CountIDcols = NULL,
  SizeIDcols = NULL,
  ModelIDs = c("CountModel", "SizeModel"),
  MaxModelsGrid = 5,
  ModelPath = getwd(),  # must not be NULL, or the function will not run
  MetaDataPath = file.path(getwd(), "ModelMetaData"),
  NumOfParDepPlots = 1)
# }
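The example above selects features by position. Since the feature arguments accept column names or column numbers, one alternative is to derive them from the target name. The sketch below uses a made-up toy table; the real CountData comes from IntermittentDemandBootStrapper(), and its actual columns are not shown in this documentation.

```r
# Hypothetical stand-in for CountData; column names other than "Counts" are invented.
CountData <- data.frame(
  Counts   = c(0, 2, 1, 4),
  Feature1 = c(1.5, 0.3, 2.2, 0.9),
  Feature2 = c(10, 12, 9, 15)
)

CountTargetColumnName <- "Counts"

# By name: every column except the target.
CountFeatureColNames <- setdiff(names(CountData), CountTargetColumnName)

# By number: the same selection expressed as column positions.
CountFeatureColNumbers <- which(names(CountData) != CountTargetColumnName)

print(CountFeatureColNames)
print(CountFeatureColNumbers)
```

Selecting by name keeps the call correct if the column order of the table changes.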