- ...
The variable(s) to build features for. A single data frame or
matrix may be provided as well. Missing values are not allowed.
- trees
The number of trees to sample.
- depths
The depth of each tree. By default, depths are drawn from a
Poisson distribution calibrated to produce trees with about 2.5 leaves on
average, matching the traditional BART prior.
- vars
Integer indices of the variables to use for each tree. If
provided, these override the indices generated automatically by sampling
uniformly from the available input features. Must be provided in flat form,
with length equal to sum(depths).
- thresh
The thresholds for each variable. If provided, these override the
thresholds generated automatically by sampling uniformly from ranges, which
defaults to the range of each input feature. Must be provided in flat form,
with length equal to sum(depths).
- drop
Columns in the calculated indicator matrix to drop. By default,
any leaves that match zero input rows are dropped. If provided, this
argument overrides that default.
- min_drop
Controls the default dropping of columns. Leaves which match
min_drop or fewer input rows are dropped. Defaults to 0, so only empty
leaves are dropped.
- ranges
The range of the input features, provided as a matrix with two
rows and a column for each input feature. The first row is the minimum and
the second row is the maximum.
- mean_depth
The mean prior depth of each tree, where a single node has
depth 0 and a two-leaf tree has depth 1. This value minus one becomes
the rate parameter of a Poisson distribution, whose samples are then
shifted up by one. The expected depth therefore equals mean_depth, and no
zero-depth trees (which produce trivial features) are sampled.
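
The sampling rules described under depths, mean_depth, vars, thresh, and min_drop can be sketched as follows. This is an illustrative Python sketch, not the package's implementation. The helper names (sample_depths, split_flat, default_drop), the 0-based variable indices, the tree-by-tree ordering of the flat vectors, and all example values are assumptions made for illustration.

```python
import math
import random


def sample_depths(trees, mean_depth, rng):
    """Draw one depth per tree as 1 + Poisson(mean_depth - 1)."""
    if mean_depth < 1:
        raise ValueError("mean_depth must be at least 1 so the rate is non-negative")
    limit = math.exp(-(mean_depth - 1.0))
    depths = []
    for _ in range(trees):
        # Knuth's multiplication method for a single Poisson draw.
        k, p = 0, rng.random()
        while p > limit:
            k += 1
            p *= rng.random()
        depths.append(k + 1)  # shift up by one: no zero-depth trees
    return depths


def split_flat(values, depths):
    """Split a flat sequence of length sum(depths) into one list per tree.

    Assumes (hypothetically) that the flat form is ordered tree by tree.
    """
    if len(values) != sum(depths):
        raise ValueError("flat input must have length sum(depths)")
    out, start = [], 0
    for d in depths:
        out.append(values[start:start + d])
        start += d
    return out


def default_drop(leaf_counts, min_drop=0):
    """Indices of leaves that match min_drop or fewer input rows."""
    return [j for j, n in enumerate(leaf_counts) if n <= min_drop]


rng = random.Random(1)  # fixed seed so the sketch is reproducible
depths = sample_depths(trees=5, mean_depth=1.5, rng=rng)  # 1.5 is arbitrary

# Example ranges: row 0 holds the minimums, row 1 the maximums.
ranges = [[0.0, -1.0, 10.0],
          [1.0, 1.0, 20.0]]
n_features = len(ranges[0])

# One variable index (0-based here) and one threshold per unit of depth.
vars_flat = [rng.randrange(n_features) for _ in range(sum(depths))]
thresh_flat = [rng.uniform(ranges[0][v], ranges[1][v]) for v in vars_flat]

vars_by_tree = split_flat(vars_flat, depths)
thresh_by_tree = split_flat(thresh_flat, depths)
```

With min_drop left at 0, default_drop([0, 3, 1, 0]) returns [0, 3], the two empty leaves. Raising min_drop to 1 also drops the leaf that matches a single row.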