- d
A data frame from `prep_data`. If you want to prepare
your data on your own, use `prep_data(..., no_prep = TRUE)`.
- outcome
Optional. Name of the column to predict. When omitted, the
outcome from `prep_data` is used; otherwise it must match the
outcome provided to `prep_data`.
- models
Names of models to try. See `get_supported_models`
for available models. Default is all available models.
- metric
Which metric should be used to assess model performance?
Options for classification: "ROC" (default; area under the receiver
operating characteristic curve) or "PR" (area under the precision-recall
curve). Options for regression: "RMSE" (default; root-mean-squared error),
"MAE" (mean-absolute error), or "Rsquared". Options for
multiclass: "Accuracy" (default) or "Kappa" (accuracy, adjusted for class
imbalance).
- positive_class
For classification only, which outcome level is the
"yes" case, i.e. should be associated with high probabilities? Defaults to
"Y" or "yes" if present; otherwise, the first level of the outcome
variable (first alphabetically if the training data outcome was not already
a factor).
- n_folds
How many folds to use in cross-validation? Default = 5.
- tune_depth
How many hyperparameter combinations to try? Default = 10.
Value is multiplied by 5 for regularized regression. Increasing this value
when tuning XGBoost models may be particularly useful for performance.
- hyperparameters
Optional, a list of data frames containing
hyperparameter values to tune over. If NULL (default), a random,
`tune_depth`-deep search of the hyperparameter space will be
performed. If provided, this overrides `tune_depth`. Should be a named list
of data frames where the names of the list correspond to models (e.g. "rf")
and each column in the data frame contains hyperparameter values. See
`hyperparameters` for a template. If only one model is
specified to the `models` argument, the data frame can be provided
bare to this argument.
- model_class
"regression" or "classification". If not provided, this
will be determined by the class of `outcome` with the determination
displayed in a message.
- model_name
Quoted, name of the model. Defaults to the name of the
outcome variable.
- allow_parallel
Deprecated. Instead, control the number of cores through your
parallel back end (e.g. with `doMC`).
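
The shape expected by the hyperparameters argument (a named list of data frames, one per model) may be easier to see in code. The following is a minimal sketch in base R. The column names `mtry`, `splitrule`, and `min.node.size` are assumptions chosen for illustration, as are the object names `rf_grid`, `hp`, and `prepped`. The commented call assumes these arguments belong to a function such as `tune_models`, which this section does not name.

```r
# Hypothetical tuning grid for a random forest ("rf"). The column names
# below are assumptions for illustration; consult the hyperparameters
# template for the names each model actually expects.
rf_grid <- data.frame(
  mtry = c(2, 4),
  splitrule = "gini",
  min.node.size = c(1, 5)
)

# Named list: each name is a model, each element is that model's grid.
# Each row of a grid is one hyperparameter combination to evaluate.
hp <- list(rf = rf_grid)

# Sketch of a call (not run here). `prepped` stands in for a data frame
# returned by prep_data, and the function name is an assumption:
# models <- tune_models(d = prepped, models = "rf", hyperparameters = hp)
```

Because only one model is named in this sketch, the text above suggests `rf_grid` could also be passed bare instead of being wrapped in a list.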