train.xgboost: train.xgboost

Description

This function wraps xgb.train to standardize model training within the traineR framework. It automatically handles preprocessing, parameter configuration, multiclass settings, and metadata generation for predictions.

Usage

train.xgboost(
  formula,
  data,
  nrounds,
  evals = list(),
  custom_metric = NULL,
  verbose = 1,
  print_every_n = 1L,
  early_stopping_rounds = NULL,
  maximize = NULL,
  save_period = NULL,
  save_name = "xgboost.model",
  xgb_model = NULL,
  callbacks = list(),
  eval_metric = NULL,
  extra_params = NULL,
  booster = "gbtree",
  objective = NULL,
  eta = 0.3,
  gamma = 0,
  max_depth = 6,
  min_child_weight = 1,
  subsample = 1,
  colsample_bytree = 1,
  ...
)

Value

An object of class xgb.Booster.prmdt containing:

The trained xgboost model.
Metadata used by traineR for prediction output.

Arguments

formula

A model formula describing the response and predictors.

data

A data frame containing the training data. Internally, it is converted to an xgb.DMatrix.
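The conversion to an xgb.DMatrix can be pictured roughly as follows. This is only a sketch of one common way to do it, assuming the predictors are expanded with model.matrix and that class labels are encoded as zero-based integers; the actual preprocessing inside train.xgboost may differ.

```r
library(xgboost)

# Assumption: predictors are expanded into a numeric matrix and the
# intercept column is dropped, since xgboost does not need it.
x <- model.matrix(Species ~ ., data = iris)[, -1]

# Assumption: xgboost expects class labels as integers starting at 0.
y <- as.integer(iris$Species) - 1L

dtrain <- xgb.DMatrix(data = x, label = y)
nrow(dtrain)
```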

nrounds

Maximum number of boosting iterations.

evals

A named list of xgb.DMatrix objects to evaluate on during training. If the list is empty, the training data is used.

custom_metric

A custom evaluation function for xgboost.

verbose

Controls verbosity: 0 = silent, 1 = progress printed.

print_every_n

Print evaluation results every print_every_n iterations.

early_stopping_rounds

Number of consecutive rounds without improvement in the evaluation metric after which training stops. NULL (the default) disables early stopping.

maximize

Logical indicating if the evaluation metric should be maximized.

save_period

Save the model every save_period rounds. Defaults to saving at the end.

save_name

File name for saving the model.

xgb_model

A previously trained xgboost model for continuation.

callbacks

A list of callback functions for xgboost during training.

eval_metric

Evaluation metric for xgboost (e.g., "mlogloss", "rmse").

extra_params

Optional list of additional xgboost parameters.

booster

Booster type: "gbtree" or "gblinear". Default is "gbtree".

objective

Objective function for xgboost. If NULL, it's chosen automatically:

Regression → "reg:squarederror"
Binary classification → "binary:logistic"
Multiclass → "multi:softprob"
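The mapping above can be illustrated with a small helper. choose_objective is a hypothetical name used here for illustration only; it is not part of traineR, and the package may inspect the response differently.

```r
# Hypothetical helper (not a traineR function): pick an xgboost objective
# from the type of the response variable, following the mapping above.
choose_objective <- function(y) {
  if (is.numeric(y)) {
    return("reg:squarederror")       # numeric response: regression
  }
  n_classes <- nlevels(as.factor(y))
  if (n_classes == 2) {
    "binary:logistic"                # two classes
  } else {
    "multi:softprob"                 # more than two classes
  }
}

choose_objective(iris$Sepal.Length)       # "reg:squarederror"
choose_objective(factor(c("no", "yes")))  # "binary:logistic"
choose_objective(iris$Species)            # "multi:softprob"
```

Passing objective explicitly skips this automatic choice.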

eta

Learning rate. Default is 0.3.

gamma

Minimum loss reduction for a split. Default is 0.

max_depth

Maximum depth of trees. Default is 6.

min_child_weight

Minimum sum of instance weight in a child. Default is 1.

subsample

Subsample ratio for training instances. Default is 1.

colsample_bytree

Subsample ratio of columns per tree. Default is 1.

...

Additional arguments for xgb.train.
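A typical call might look like the sketch below, which trains a multiclass model on iris. It assumes the traineR and xgboost packages are installed; the split ratio, seed, and nrounds value are arbitrary choices for illustration, and the predict call with type = "class" is assumed to follow the usual traineR prediction interface.

```r
library(traineR)

# Arbitrary seed and 75/25 split, for illustration only.
set.seed(123)
idx <- sample(seq_len(nrow(iris)), size = floor(0.75 * nrow(iris)))
data.train <- iris[idx, ]
data.test  <- iris[-idx, ]

# objective is left as NULL, so it is chosen from the response type.
model <- train.xgboost(Species ~ ., data.train, nrounds = 50, verbose = 0)
class(model)

# Assumed traineR prediction interface.
pred <- predict(model, data.test, type = "class")
```

Leaving eta, max_depth, and the other tuning arguments at their defaults keeps the call short; they can be set explicitly in the same call.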
