# xgboost

##### eXtreme Gradient Boosting (Tree) library

A simple interface for training an xgboost model. See the xgb.train function for a more advanced interface.

##### Usage
xgboost(data = NULL, label = NULL, missing = NULL, params = list(), nrounds, verbose = 1, print.every.n = 1L, early.stop.round = NULL, maximize = NULL, ...)
##### Arguments
data
Accepts a matrix, a dgCMatrix, a local data file, or an xgb.DMatrix.
label
The response variable. Do not set this field if data is a local data file or an xgb.DMatrix.
missing
Used only when the input is a dense matrix. Pick a float value that represents a missing value; some data sets use 0 or another extreme value to represent missing values.
params
the list of parameters.

Commonly used ones are:

• objective: the objective function; common ones are
  • reg:linear for linear regression
  • binary:logistic for logistic regression (classification)
• eta: step size of each boosting step
• max.depth: maximum depth of a tree
• nthread: number of threads used in training; if not set, all threads are used

See xgb.train for a more complete list of parameters, or https://github.com/dmlc/xgboost/wiki/Parameters for the full list.

See also demo/ for walkthrough example in R.

nrounds
The maximum number of boosting iterations.
verbose
If 0, xgboost stays silent. If 1, xgboost prints performance information. If 2, xgboost prints both performance and tree construction progress information.
print.every.n
Print every N-th progress message when verbose > 0. The default is 1, which means all messages are printed.
early.stop.round
If NULL, early stopping is not triggered. If set to an integer k, training with a validation set stops if performance keeps getting worse for k consecutive rounds.
maximize
If feval and early.stop.round are set, maximize must be set as well. maximize = TRUE means that a larger evaluation score is better.
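The interaction between early.stop.round and maximize can be illustrated with a small base R sketch. The helper `stop_round` and the score vector below are hypothetical and are not part of xgboost; the sketch uses one common reading of the rule (stop once the score has failed to improve on the best value seen for k consecutive rounds), which may differ in detail from the package's internal logic.

```r
# Hypothetical helper, not an xgboost function: given one validation score
# per round, return the round at which training would stop.
stop_round <- function(scores, k, maximize = FALSE) {
  best <- if (maximize) -Inf else Inf
  since_best <- 0
  for (i in seq_along(scores)) {
    improved <- if (maximize) scores[i] > best else scores[i] < best
    if (improved) {
      best <- scores[i]
      since_best <- 0
    } else {
      since_best <- since_best + 1
      if (since_best >= k) return(i)  # k rounds without improvement
    }
  }
  length(scores)  # rule never triggered: all rounds were run
}

# Made-up validation errors (lower is better, so maximize = FALSE).
val_error <- c(0.30, 0.25, 0.22, 0.23, 0.24, 0.26, 0.21)
stop_round(val_error, k = 3)
```

For a metric where larger is better, such as AUC, the same helper would be called with `maximize = TRUE`, which is the distinction the maximize argument expresses.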
...
Other parameters to pass to params.
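As a sketch of how params and ... relate, the two calls below supply the same settings, first collected in a params list and then passed as extra arguments. This assumes the package version whose signature appears under Usage; later releases changed the interface and may not accept these calls unchanged. The names `param`, `bst_list`, and `bst_dots` are illustrative.

```r
library(xgboost)

data(agaricus.train, package = "xgboost")
train <- agaricus.train

# Option 1: collect booster settings in a list and pass it as `params`.
param <- list(objective = "binary:logistic", max.depth = 2,
              eta = 1, nthread = 2)
bst_list <- xgboost(data = train$data, label = train$label,
                    params = param, nrounds = 2, verbose = 0)

# Option 2: pass the same settings directly; they are collected through `...`.
bst_dots <- xgboost(data = train$data, label = train$label,
                    objective = "binary:logistic", max.depth = 2,
                    eta = 1, nthread = 2, nrounds = 2, verbose = 0)
```

Keeping the settings in a list makes it easier to reuse them across several calls, while the second form is convenient for quick experiments.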
##### Details

This is the modeling function for xgboost.

Parallelization is automatically enabled if OpenMP is present.

The number of threads can also be set manually via the nthread parameter.

##### Examples
data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')
train <- agaricus.train
test <- agaricus.test
bst <- xgboost(data = train$data, label = train$label, max.depth = 2,
               eta = 1, nthread = 2, nrounds = 2, objective = "binary:logistic")
pred <- predict(bst, test$data)
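With the binary:logistic objective, predict returns probabilities rather than class labels. The following sketch shows one way to turn such probabilities into labels and compute a test error. So that it runs on its own, it uses small made-up vectors (`pred_demo`, `label_demo`) as stand-ins for `pred` and `test$label`; the 0.5 cutoff is an assumption, not something the function prescribes.

```r
# Hypothetical stand-ins for `pred` and `test$label` from the example above.
pred_demo  <- c(0.92, 0.08, 0.35, 0.71, 0.49)
label_demo <- c(1, 0, 1, 1, 0)

# Threshold the probabilities at 0.5 to obtain 0/1 class predictions.
pred_class <- as.numeric(pred_demo > 0.5)

# Classification error: the fraction of predictions that differ from the labels.
err <- mean(pred_class != label_demo)
print(err)
```

With the real objects, the same two lines would be applied to `pred` and `test$label`.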