- .data
A data frame.
- formula
formula
- tune_method
Tuning method. Defaults to grid.
- event_level
For binary classification, the factor level treated as the positive class. Specify "second" to use the second level.
- n_fold
Integer. Number of folds in the resamples.
- n_iter
Number of tuning iterations (bayes); parameter grid size (grid)
- seed
Random seed.
- save_output
Default FALSE. If set to TRUE, writes the output as an rds file.
- parallel
Default TRUE. If set to TRUE, enables parallel processing on resamples for grid tuning.
- trees
Number of trees (xgboost: nrounds) (type: integer, default: 500L)
- min_n
Minimal Node Size (xgboost: min_child_weight) (type: integer, default: 2L); typical range: 2-10. Keep the value small for highly imbalanced class data, where leaf nodes can contain smaller groups. Otherwise, increase it to prevent overfitting to outliers.
- mtry
Number of randomly selected predictors; defaults to .75; (xgboost: colsample_bynode) (type: numeric, range 0 - 1) (or type: integer if counts = TRUE)
- tree_depth
Tree Depth (xgboost: max_depth) (type: integer, default: 7L); Typical values: 3-10
- learn_rate
Learning Rate (xgboost: eta) (type: double, default: 0.05); Typical values: 0.01-0.3
- loss_reduction
Minimum Loss Reduction (xgboost: gamma) (type: double, default: 1.0); range: 0 to Inf; typical value: 0 - 20 assuming low-mid tree depth
- sample_size
Proportion Observations Sampled (xgboost: subsample) (type: double, default: .75); Typical values: 0.5 - 1
- stop_iter
Number of iterations before stopping (xgboost: early_stop) (type: integer, default: 15L). Only enabled if a validation set is provided.
- counts
If TRUE, specify mtry as an integer number of columns. Default FALSE, which specifies mtry as a fraction of columns from 0 to 1.
- tree_method
xgboost tree_method. Default is auto. Reference: tree method docs
- monotone_constraints
An integer vector with one entry per predictor column, each -1, 1, or 0, corresponding to a decreasing, increasing, or no constraint, respectively, on the predictor column at that index. Reference: monotonicity docs.
- num_parallel_tree
Should be set to the size of the forest being trained. Default 1L.
- lambda
[default=.5] L2 regularization term on weights. Increasing this value makes the model more conservative.
- alpha
[default=.1] L1 regularization term on weights. Increasing this value makes the model more conservative.
- scale_pos_weight
[default=1] Controls the balance of positive and negative weights; useful for unbalanced classes. If set to TRUE, calculates sum(negative instances) / sum(positive instances). If the first level is the majority class, use values < 1; otherwise, values > 1 are normally used to balance the class distribution.
- verbosity
[default=1] Verbosity of printing messages. Valid values are 0 (silent), 1 (warning), 2 (info), 3 (debug).
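
Several of the arguments above are easier to set once the values are worked out on paper. The following is a minimal base R sketch, not the function's internal code: the toy outcome `y`, the predictor names, and the rounding rule used to turn a fractional `mtry` into a column count are all assumptions made for illustration. It shows the ratio described for `scale_pos_weight = TRUE`, a `monotone_constraints` vector aligned to predictor column order, and the two ways of expressing `mtry`.

```r
# Hypothetical toy outcome: the first level ("no") is the majority class.
y <- factor(c("no", "no", "no", "no", "no", "no", "yes", "yes"),
            levels = c("no", "yes"))

# With event_level = "second", the second factor level is the positive class.
positive <- levels(y)[2]

# Ratio described for scale_pos_weight = TRUE:
# sum(negative instances) / sum(positive instances)
spw <- sum(y != positive) / sum(y == positive)

# monotone_constraints needs one entry per predictor column, in column order.
# Predictor names and directions here are made up for illustration.
predictors <- c("age", "income", "tenure")
directions <- c(age = 1L, income = 0L, tenure = -1L)  # 1 up, 0 none, -1 down
monotone_constraints <- unname(directions[predictors])

# mtry as a fraction (counts = FALSE) versus a column count (counts = TRUE).
# Rounding is an assumption of this sketch, not documented behaviour.
mtry_fraction <- 0.75
mtry_count <- max(1L, as.integer(round(mtry_fraction * length(predictors))))
```

Here the positive class is the minority, so the ratio comes out above 1, which matches the guidance given for `scale_pos_weight`. Building the constraint vector from a named lookup keeps it aligned with the predictor order if columns are later rearranged.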