
parsnip (version 0.0.0.9000)

rand_forest: General Interface for Random Forest Models

Description

rand_forest() defines a model specification before fitting, so that the same model can be created using different packages in R or via Spark. The main arguments for the model are:

  • mtry: The number of predictors that will be randomly sampled at each split when creating the tree models.

  • trees: The number of trees contained in the ensemble.

  • min_n: The minimum number of data points in a node that are required for the node to be split further.

These arguments are converted to their engine-specific names at the time that the model is fit. Other options and arguments can be set using the engine_args argument. If left at their defaults here (NULL), the values are taken from the underlying model functions.
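The name conversion described above can be pictured with a small base R sketch. It is an illustration only, not parsnip's internal code: the lookup table `arg_names` and the helper `translate_args()` are hypothetical names introduced here. The engine-side names are the corresponding arguments of ranger::ranger() and randomForest::randomForest(). Arguments left as NULL are dropped, so the engine's own defaults apply.

```r
# Hypothetical lookup table, for illustration only; not part of parsnip's API.
# Each row pairs a rand_forest() argument with its engine-specific name.
arg_names <- data.frame(
  parsnip      = c("mtry", "trees", "min_n"),
  ranger       = c("mtry", "num.trees", "min.node.size"),
  randomForest = c("mtry", "ntree", "nodesize"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rename the main arguments for a given engine.
translate_args <- function(args, engine) {
  # Drop arguments left at NULL so the engine default is used instead.
  args <- args[!vapply(args, is.null, logical(1))]
  idx <- match(names(args), arg_names$parsnip)
  if (anyNA(idx)) {
    stop("Unknown main argument; engine options belong in engine_args.")
  }
  names(args) <- arg_names[[engine]][idx]
  args
}

# mtry is NULL here, so only the two remaining arguments are renamed.
translate_args(list(mtry = NULL, trees = 2000, min_n = 5), engine = "ranger")
```

Calling the helper with engine = "randomForest" instead would rename the same two arguments using the randomForest column of the table.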

Usage

rand_forest(mode = "unknown", mtry = NULL, trees = NULL, min_n = NULL,
  engine_args = list(), ...)

Arguments

mode

A single character string for the type of model. Possible values for this model are "unknown", "regression", or "classification".

mtry

An integer for the number of predictors that will be randomly sampled at each split when creating the tree models.

trees

An integer for the number of trees contained in the ensemble.

min_n

An integer for the minimum number of data points in a node that are required for the node to be split further.

engine_args

A named list of arguments to be used by the underlying models (e.g., ranger::ranger, randomForest::randomForest, etc.). These are not evaluated until the model is fit and will be substituted into the model fit expression.

...

Used for method consistency. Any arguments passed through the ellipsis will result in an error. Use engine_args instead.
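As a sketch of how the arguments above fit together, the following specification sets the three main arguments and passes engine-specific options through engine_args. It assumes this development version of parsnip is installed and that the model will later be fit with the "ranger" engine. The options `importance` and `seed` are arguments of ranger::ranger(), chosen here only as examples, and the numeric values are arbitrary.

```r
# NOT RUN {
library(parsnip)

# Engine-specific options go in a named list. `importance` and `seed` are
# arguments of ranger::ranger(), not of rand_forest(); they are not
# evaluated until the model is fit.
ranger_args <- list(importance = "impurity", seed = 1234)

rf_spec <- rand_forest(
  mode = "classification",
  mtry = 3,       # predictors sampled at each split
  trees = 500,    # trees in the ensemble
  min_n = 5,      # minimum data points needed to split a node
  engine_args = ranger_args
)
# }
```

Passing `importance = "impurity"` directly to rand_forest() rather than inside engine_args would go to the ellipsis and, as noted above, result in an error.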

Details

The data given to the function are not saved and are only used to determine the mode of the model. For rand_forest, the possible modes are "regression" and "classification".

The model can be fit with the fit() function using one of the following engines:

  • R: "ranger" or "randomForest"

  • Spark: "spark"

See Also

varying(), fit()

Examples

# NOT RUN {
rand_forest(mode = "classification", trees = 2000)

# Parameters can be represented by a placeholder:
rand_forest(mode = "regression", mtry = varying())
# }
