xgboost (version 0.4-3)

xgboost: eXtreme Gradient Boosting (Tree) library

Description

A simple interface for training an xgboost model. See the xgb.train function for a more advanced interface.

Usage

xgboost(data = NULL, label = NULL, missing = NULL, params = list(),
  nrounds, verbose = 1, print.every.n = 1L, early.stop.round = NULL,
  maximize = NULL, ...)

Arguments

data
takes a matrix, dgCMatrix, local data file or xgb.DMatrix.
label
the response variable. The user should not set this field if data is a local data file or an xgb.DMatrix.
missing
Missing is only used when the input is a dense matrix; pick a float value that represents missing values. Some datasets use 0 or another extreme value to represent missing values.
params
the list of parameters. A short sketch of supplying parameters through this list appears after the argument descriptions.

Commonly used ones are:

  • objective: objective function, common ones are
    • reg:linear: linear regression
    • binary:logistic: logistic regression for classification

nrounds
the maximum number of boosting iterations
verbose
If 0, xgboost will stay silent. If 1, xgboost will print performance information. If 2, xgboost will print both performance information and tree construction progress.
print.every.n
Print every N progress messages when verbose > 0. The default is 1, meaning all messages are printed.
early.stop.round
If NULL, early stopping is not triggered. If set to an integer k, training with a validation set will stop if the performance keeps getting worse for k consecutive rounds.
maximize
If feval and early.stop.round are set, then maximize must be set as well. maximize=TRUE means the larger the evaluation score the better.
...
other parameters to pass to params.
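
A minimal sketch (not from the original page) of supplying parameters through the params list rather than through ..., using an xgb.DMatrix so that label stays unset; the values are illustrative, and the training set is assumed to be the monitored evaluation set for early stopping under the default verbose = 1:

library(xgboost)
data(agaricus.train, package = 'xgboost')
# wrap features and labels in an xgb.DMatrix, so `label` is left unset
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
# tuning parameters collected in a list and passed via `params`
param <- list(max.depth = 2, eta = 1, objective = "binary:logistic")
bst_p <- xgboost(data = dtrain, params = param, nrounds = 2)
# hedged: with verbose = 1 the training set is assumed to be monitored,
# so training stops if its error worsens for 3 consecutive rounds
bst_es <- xgboost(data = dtrain, params = param, nrounds = 10,
                  early.stop.round = 3, maximize = FALSE)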

Details

This is the simple modeling function for xgboost.

Parallelization is enabled automatically if OpenMP is present.

The number of threads can also be specified manually via the nthread parameter.
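
As a hedged illustration of manual thread control (it only has an effect when xgboost was compiled with OpenMP support), nthread can be passed like any other parameter; the values below are illustrative:

data(agaricus.train, package = 'xgboost')
# restrict training to 2 threads
bst_2t <- xgboost(data = agaricus.train$data, label = agaricus.train$label,
                  max.depth = 2, eta = 1, nthread = 2, nrounds = 2,
                  objective = "binary:logistic")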

Examples

data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')
train <- agaricus.train
test <- agaricus.test
bst <- xgboost(data = train$data, label = train$label, max.depth = 2,
               eta = 1, nthread = 2, nrounds = 2, objective = "binary:logistic")
pred <- predict(bst, test$data)
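
As a follow-up check (not part of the original example), the probabilities returned for the binary:logistic objective can be thresholded at 0.5 to get a rough test error rate:

# pred holds probabilities of the positive class; compare thresholded
# predictions with the true labels (illustrative check only)
err <- mean(as.numeric(pred > 0.5) != test$label)
print(paste("test error:", err))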
