rule_fit

General Interface for RuleFit Models

rule_fit() is a way to generate a specification of a model before fitting. The main arguments for the model are:

  • mtry: The number of predictors that will be randomly sampled at each split when creating the tree models.

  • trees: The number of trees contained in the ensemble.

  • min_n: The minimum number of data points in a node that are required for the node to be split further.

  • tree_depth: The maximum depth of the tree (i.e. number of splits).

  • learn_rate: The rate at which the boosting algorithm adapts from iteration-to-iteration.

  • loss_reduction: The reduction in the loss function required to split further.

  • sample_size: The amount of data exposed to the fitting routine.

These arguments are converted to their specific names at the time that the model is fit. Other options and arguments can be set using parsnip::set_engine(). If left to their defaults here (NULL), the values are taken from the underlying model functions. If parameters need to be modified, update() can be used in lieu of recreating the object from scratch.
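
As a hedged sketch of how these pieces fit together (the argument values, the explicit "xrf" engine, and the regression mode are illustrative choices, not defaults):

library(parsnip)
library(rules)

# Main arguments are set in the specification; engine-specific options
# would be passed through set_engine()
spec <-
  rule_fit(trees = 50, tree_depth = 4, penalty = 0.01) %>%
  set_engine("xrf") %>%
  set_mode("regression")

# Change a main argument later without rebuilding the specification
spec <- update(spec, trees = 100)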

Usage
rule_fit(
  mode = "unknown",
  mtry = NULL,
  trees = NULL,
  min_n = NULL,
  tree_depth = NULL,
  learn_rate = NULL,
  loss_reduction = NULL,
  sample_size = NULL,
  penalty = NULL
)

# S3 method for rule_fit
update(
  object,
  parameters = NULL,
  mtry = NULL,
  trees = NULL,
  min_n = NULL,
  tree_depth = NULL,
  learn_rate = NULL,
  loss_reduction = NULL,
  sample_size = NULL,
  penalty = NULL,
  fresh = FALSE,
  ...
)

Arguments
mode

A single character string for the type of model. Possible values for this model are "unknown", "regression", or "classification".

mtry

A number for the number (or proportion) of predictors that will be randomly sampled at each split when creating the tree models.

trees

An integer for the number of trees contained in the ensemble.

min_n

An integer for the minimum number of data points in a node that are required for the node to be split further.

tree_depth

An integer for the maximum depth of the tree (i.e. number of splits).

learn_rate

A number for the rate at which the boosting algorithm adapts from iteration-to-iteration.

loss_reduction

A number for the reduction in the loss function required to split further.

sample_size

A number for the number (or proportion) of data that is exposed to the fitting routine.

penalty

L1 regularization parameter.

object

A rule_fit model specification.

parameters

A 1-row tibble or named list with main parameters to update. If the individual arguments are used, these will supersede the values in parameters. Also, using engine arguments in this object will result in an error. See the sketch after this argument list for an example.

fresh

A logical for whether the arguments should be modified in-place or replaced wholesale.

...

Not used for update().
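
A hedged illustration of the parameters argument (with rules and parsnip attached as in the earlier sketch; the values are arbitrary):

spec <- rule_fit(trees = 10, min_n = 2)

# A named list (or 1-row tibble) can carry several main arguments at once
update(spec, parameters = list(trees = 50, min_n = 5))

# Individual arguments supersede values supplied via `parameters`
update(spec, parameters = list(trees = 50), trees = 100)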

Details

The RuleFit model creates a regression model of rules in two stages. The first stage uses a tree-based model to generate a set of rules that can be filtered, modified, and simplified. These rules are then added as predictors to a regularized generalized linear model that can also conduct feature selection during model training.

For the xrf engine, the xgboost package is used to create the rule set that is then added to a glmnet model.

The only available engine is "xrf".
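
A minimal end-to-end sketch of this two-stage fit, assuming the built-in mtcars data and argument values chosen only for illustration:

library(parsnip)
library(rules)

set.seed(1)
fit_xrf <-
  rule_fit(trees = 5, tree_depth = 3, penalty = 0.05) %>%
  set_engine("xrf") %>%
  set_mode("regression") %>%
  fit(mpg ~ ., data = mtcars)

# Predictions use the standard parsnip interface
predict(fit_xrf, new_data = mtcars[1:3, ])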

Value

An updated parsnip model specification.

References

Friedman, J. H., and Popescu, B. E. (2008). "Predictive learning via rule ensembles." The Annals of Applied Statistics, 2(3), 916-954.

See Also

parsnip::fit(), parsnip::fit_xy(), xrf::xrf.formula()

Aliases
  • rule_fit
  • update.rule_fit
Examples
library(parsnip)
library(rules)

rule_fit()

# Main arguments can be set when the specification is created:
rule_fit(trees = 7)

# ------------------------------------------------------------------------------

set.seed(6907)
rule_fit_rules <-
  rule_fit(trees = 3, penalty = 0.1) %>%
  set_engine("xrf") %>%
  set_mode("classification") %>%
  fit(Species ~ ., data = iris)

# ------------------------------------------------------------------------------

model <- rule_fit(trees = 10, min_n = 2)
model
update(model, trees = 1)
update(model, trees = 1, fresh = TRUE)
Documentation reproduced from package rules, version 0.0.1, License: MIT + file LICENSE
