xgb_train() is a wrapper for xgboost tree-based models where all of the model arguments are in the main function.
xgb_train(
x,
y,
max_depth = 6,
nrounds = 15,
eta = 0.3,
colsample_bynode = NULL,
colsample_bytree = NULL,
min_child_weight = 1,
gamma = 0,
subsample = 1,
validation = 0,
early_stop = NULL,
objective = NULL,
counts = TRUE,
event_level = c("first", "second"),
...
)
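For example, a minimal regression fit looks like the following. This is a sketch assuming parsnip (which provides xgb_train()) and xgboost are installed; the mtcars columns are used purely for illustration.

library(parsnip)

# Predictors as a matrix, outcome as a numeric vector
x <- as.matrix(mtcars[, c("cyl", "disp", "hp", "wt")])
y <- mtcars$mpg

# 50 boosting iterations with a lowered learning rate
fit <- xgb_train(x, y, nrounds = 50, eta = 0.1)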
Arguments:

x: A data frame or matrix of predictors.

y: A vector (factor or numeric) or matrix (numeric) of outcome data.

max_depth: An integer for the maximum depth of the tree.

nrounds: An integer for the number of boosting iterations.

eta: A numeric value between zero and one to control the learning rate.

colsample_bynode: Subsampling proportion of columns for each node within each tree. See the counts argument below. The default uses all columns.

colsample_bytree: Subsampling proportion of columns for each tree. See the counts argument below. The default uses all columns.

min_child_weight: A numeric value for the minimum sum of instance weights needed in a child to continue to split.

gamma: A number for the minimum loss reduction required to make a further partition on a leaf node of the tree.

subsample: Subsampling proportion of rows. By default, all of the training data are used.

validation: The proportion of the data that are used for performance assessment and potential early stopping.

early_stop: An integer or NULL. If not NULL, it is the number of training iterations without improvement before stopping. If validation is used, performance is based on the validation set; otherwise, the training set is used. See the sketch after this list.

objective: A single string (or NULL) that defines the loss function that xgboost uses to create trees. See xgboost::xgb.train() for options. If left NULL, an appropriate loss function is chosen.

counts: A logical. If FALSE, colsample_bynode and colsample_bytree are interpreted as proportions of the columns rather than counts of columns. See the sketch after this list.

event_level: For binary classification, a single string of either "first" or "second" describing which level of the outcome should be considered the "event". See the classification sketch at the end of this page.

...: Other options to pass to xgboost::xgb.train().
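As a sketch of the validation, early_stop, and counts arguments working together (again assuming parsnip is installed; mtcars is placeholder data):

library(parsnip)

x <- as.matrix(mtcars[, -1])  # all columns except mpg
y <- mtcars$mpg

# counts = FALSE makes colsample_bytree a proportion, so 0.75
# samples three quarters of the columns for each tree. One tenth
# of the rows are held out via `validation`, and training stops
# after 5 iterations without improvement on that holdout.
fit <- xgb_train(
  x, y,
  nrounds = 200,
  colsample_bytree = 0.75,
  counts = FALSE,
  validation = 0.1,
  early_stop = 5
)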
Value: A fitted xgboost object.
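Because the return value is a plain xgboost object, predictions come from xgboost's own predict() method rather than a parsnip method. A binary classification sketch (the two-level outcome and column choices here are assumptions for illustration):

library(parsnip)

x <- as.matrix(mtcars[, c("disp", "hp", "wt")])
y <- factor(ifelse(mtcars$am == 1, "manual", "automatic"))

# With a two-level factor outcome, a binary objective is chosen;
# event_level = "second" treats "manual" (the second factor
# level) as the event.
fit <- xgb_train(x, y, nrounds = 20, event_level = "second")

# predict() on the fitted booster returns event probabilities
probs <- predict(fit, x)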