RemixAutoML (version 0.11.0)

AutoCatBoostHurdleModel: AutoCatBoostHurdleModel for generalized hurdle modeling

Description

AutoCatBoostHurdleModel for generalized hurdle modeling. See the Readme.Rd on GitHub for more background.

Usage

AutoCatBoostHurdleModel(data, ValidationData = NULL, TestData = NULL,
  Buckets = 0, TargetColumnName = NULL, FeatureColNames = NULL,
  PrimaryDateColumn = NULL, IDcols = NULL,
  TransformNumericColumns = NULL, ClassWeights = NULL,
  SplitRatios = c(0.7, 0.2, 0.1), task_type = "GPU",
  ModelID = "ModelTest", Paths = NULL, MetaDataPaths = NULL,
  SaveModelObjects = TRUE, Trees = 1000, GridTune = TRUE,
  MaxModelsInGrid = 1, NumOfParDepPlots = 10, PassInGrid = NULL)

Arguments

data

Source training data. Do not include a column that has the class labels for the buckets as they are created internally.

ValidationData

Source validation data. Do not include a column that has the class labels for the buckets as they are created internally.

TestData

Source test data. Do not include a column that has the class labels for the buckets as they are created internally.

Buckets

A numeric vector of the bucket values used for subsetting the data. NOTE: the final bucket value creates two subsets: one for data less than the value and a second for data greater than the value.
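
As a rough illustration of how bucket values can divide a target into subsets, here is a minimal base R sketch. It is not the package's internal code: the `target` values and the `Buckets` vector are made up, and the treatment of values exactly equal to a bucket value (assigned to the upper subset here) is an assumption.

```r
# Hypothetical illustration only; not RemixAutoML's internal implementation.
target  <- c(0, 0, 0.5, 3, 12, 40)  # made-up zero-inflated target values
Buckets <- c(1, 10)                 # made-up bucket values

# findInterval() counts how many bucket values are <= each target value,
# so values below 1 get 0, values in [1, 10) get 1, and values >= 10 get 2.
bucket_id <- findInterval(target, Buckets)

# One subset per bucket id; the final bucket value (10) produces both a
# "below" subset (id 1) and an "above" subset (id 2).
subsets <- split(target, bucket_id)
print(subsets)
```

With two bucket values this yields three subsets, which matches the NOTE above: the last value contributes one subset on each side.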

TargetColumnName

Supply the column name or number for the target variable

FeatureColNames

Supply the column names or numbers of the features (not including the PrimaryDateColumn)

PrimaryDateColumn

Supply a date column if the data is functionally related to it

IDcols

Includes PrimaryDateColumn and any other columns you want returned in the validation data with predictions

TransformNumericColumns

Transform numeric column inside the AutoCatBoostRegression() function

ClassWeights

Utilize these for the classifier model

SplitRatios

Supply a vector of partition ratios. For example, c(0.70, 0.20, 0.10).
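
One way to partition a data set by ratios like these is sketched below in base R. This is an assumption-laden example, not the package's own partitioning logic: the data frame `df`, the partition labels, and the rounding rule (the last partition takes the remainder) are all hypothetical.

```r
# Hypothetical sketch; not the package's internal partitioning code.
set.seed(42)                                      # reproducible shuffle
df <- data.frame(x = rnorm(100), y = rnorm(100))  # made-up data
SplitRatios <- c(0.7, 0.2, 0.1)

# The ratios should sum to 1.
stopifnot(abs(sum(SplitRatios) - 1) < 1e-8)

n       <- nrow(df)
n_train <- round(SplitRatios[1] * n)
n_valid <- round(SplitRatios[2] * n)
n_test  <- n - n_train - n_valid                  # remainder goes to test

# Build one label per row, shuffle the labels, and split the rows by label.
labels <- rep(c("train", "validation", "test"),
              times = c(n_train, n_valid, n_test))
parts  <- split(df, sample(labels))

print(sapply(parts, nrow))
```

Shuffling the labels rather than the rows keeps the original row order inside each partition, which can be convenient when a date column is present.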

task_type

Set to "GPU" or "CPU"

ModelID

Define a character name for your models

Paths

The path to your folder where you want your model information saved

MetaDataPaths

A character string of the path to where you want your model evaluation output saved. If left NULL, all output will be saved to Paths.

SaveModelObjects

Set to TRUE to save the model objects to file in the folders listed in Paths

Trees

Number of trees to build. The usage signature above shows a default of 1000.

GridTune

Set to TRUE if you want to grid tune the models

MaxModelsInGrid

Set to a numeric value for the number of models to try in grid tune

NumOfParDepPlots

Set to pull back N number of partial dependence calibration plots.

PassInGrid

Pass in a grid for changing up the parameter settings for catboost

Value

Returns AutoCatBoostRegression() model objects: VariableImportance.csv, Model, ValidationData.csv, EvalutionPlot.png, EvalutionBoxPlot.png, EvaluationMetrics.csv, ParDepPlots.R (a named list of features with partial dependence calibration plots), ParDepBoxPlots.R, GridCollect, and catboostgrid.

See Also

Other Automated Regression: AutoCatBoostRegression, AutoH2oDRFHurdleModel, AutoH2oDRFRegression, AutoH2oGBMHurdleModel, AutoH2oGBMRegression, AutoNLS, AutoXGBoostHurdleModel, AutoXGBoostRegression

Examples

# NOT RUN {
Output <- RemixAutoML::AutoCatBoostHurdleModel(
  data,
  ValidationData = NULL,
  TestData = NULL,
  Buckets = 1,
  TargetColumnName = "Target_Variable",
  FeatureColNames = 4:ncol(data),
  PrimaryDateColumn = "Date",
  IDcols = 1:3,
  TransformNumericColumns = NULL,
  ClassWeights = NULL,
  SplitRatios = c(0.7, 0.2, 0.1),
  task_type = "GPU",
  ModelID = "ModelID",
  Paths = NULL,
  MetaDataPaths = NULL,
  SaveModelObjects = TRUE,
  Trees = 1000,
  GridTune = FALSE,
  MaxModelsInGrid = 1,
  NumOfParDepPlots = 10,
  PassInGrid = NULL)
# }