xrf.formula

object

A formula prescribing the features to use in the model. Transformation of the response variable is not supported. Transformations of the input features are generally discouraged; if you use them, set sparse = FALSE.

data

A data frame with columns corresponding to the formula.

family

The family of the fitted model. One of 'gaussian', 'binomial', or 'multinomial'.

xgb_control

A list of parameters for xgboost. Must supply an nrounds argument.

glm_control

A list of parameters for the glmnet fit. Must supply type.measure and nfolds arguments (used for the lambda cross-validation).

sparse

Whether a sparse design matrix should be used.

prefit_xgb

An xgboost model (of class xgb.Booster) to be used instead of the model that xrf would normally fit.

deoverlap

If true, the tree-derived rules are de-overlapped, so that no two rules in the resulting rule set overlap.
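
To make the sparse argument concrete, the sketch below builds a dense and a sparse design matrix from the same formula. It only illustrates the two storage formats and is not taken from the package internals; the formula and the use of the Matrix package here are assumptions made for the example.

```r
library(Matrix)  # recommended package, shipped with standard R installs

# Dense design matrix: every entry is stored, including the zeros
# produced by the Species indicator columns.
dense <- model.matrix(Petal.Length ~ . - 1, data = iris)

# Sparse design matrix: only the non-zero entries are stored.
sparse <- sparse.model.matrix(Petal.Length ~ . - 1, data = iris)

dim(dense)
class(sparse)

# Both representations hold the same values.
all.equal(as.matrix(sparse), dense, check.attributes = FALSE)
```

The sparse form pays off mainly when the design matrix has many indicator columns, which is typical once rule features are added.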

See Friedman & Popescu (2008) for a description of the general RuleFit algorithm.
This method uses XGBoost to fit a tree ensemble, extracts a rule set in which each rule is
the conjunction of the split conditions along a tree traversal, and uses glmnet to fit a
sparse linear model to the resulting rule features together with the original features.
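
The idea of rules as extra features can be sketched in base R. The two rules below are written by hand purely for illustration (in RuleFit they would be read off the fitted trees), and the names rules, rule_features, and design are hypothetical.

```r
# Hypothetical rules: each is a conjunction of split conditions,
# as would be found along a root-to-leaf path of a tree.
rules <- list(
  r1 = function(d) d$Petal.Width < 0.8,
  r2 = function(d) d$Petal.Width >= 0.8 & d$Sepal.Length < 6.2
)

# Evaluate each rule on the data as a 0/1 indicator column.
rule_features <- sapply(rules, function(rule) as.integer(rule(iris)))

# Append the rule indicators to the raw numeric features. A sparse
# linear model (for example a lasso) would then be fit on this matrix.
raw_features <- as.matrix(iris[, c("Sepal.Length", "Sepal.Width", "Petal.Width")])
design <- cbind(raw_features, rule_features)

dim(design)
colnames(design)
```

Because the linear model is sparse, many rule columns can receive a zero coefficient, which leaves a short list of rules and raw features to interpret.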

An implementation of the RuleFit algorithm as described in Friedman & Popescu
(2008) <doi:10.1214/07-AOAS148>. eXtreme Gradient Boosting ('XGBoost') is used
to build rules, and 'glmnet' is used to fit a sparse linear model on the raw and rule features. The result
is a model that learns similarly to a tree ensemble, while often being more interpretable
and faster to score in live applications. Several algorithms for
reducing rule complexity are provided, most notably hyperrectangle de-overlapping. All algorithms scale to
several million rows and support sparse representations to handle tens of thousands of dimensions.
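
As a rough intuition for de-overlapping, the sketch below handles the one-dimensional case, where each rule is an interval on a single feature. It is a simplification written for this page, not the package's hyperrectangle algorithm, and the rules data frame is made up.

```r
# Hypothetical 1-D rules, each an interval [lower, upper) on one feature.
# The first two overlap on [2, 4).
rules <- data.frame(
  lower = c(0, 2, 7),
  upper = c(4, 6, 9)
)

# Every interval endpoint is a candidate boundary between disjoint pieces.
breaks <- sort(unique(c(rules$lower, rules$upper)))
pieces <- data.frame(
  lower = head(breaks, -1),
  upper = tail(breaks, -1)
)

# Keep only the pieces that lie inside at least one original rule.
covered <- mapply(
  function(lo, hi) any(rules$lower <= lo & hi <= rules$upper),
  pieces$lower, pieces$upper
)
deoverlapped <- pieces[covered, ]
deoverlapped
```

The resulting pieces cover the same region as the original rules, but no point on the feature axis falls into more than one of them.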

xrf.formula: Fit an eXtreme RuleFit model

Description

Usage

Arguments

References

Examples
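
A minimal sketch of a call, assuming the xrf package and its dependencies are installed. The argument names follow the descriptions on this page; the specific values (20 rounds, depth 2, 5 folds, 'deviance') are illustrative choices, not recommendations.

```r
# Guarded so the snippet is a no-op when xrf is not installed.
if (requireNamespace("xrf", quietly = TRUE)) {
  library(xrf)

  fit <- xrf(
    Petal.Length ~ .,
    data = iris,
    family = "gaussian",
    # xgboost parameters; nrounds is required
    xgb_control = list(nrounds = 20, max_depth = 2),
    # glmnet parameters; type.measure and nfolds are required
    glm_control = list(type.measure = "deviance", nfolds = 5),
    sparse = TRUE,
    deoverlap = FALSE
  )

  print(class(fit))
}
```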