dtGAP (version 0.0.2)

rf_summary: Random Forest Ensemble Summary

Description

Fits a conditional random forest with partykit::cforest and displays a multi-panel summary: a variable importance barplot, an out-of-bag (OOB) error curve, and, optionally, a representative tree, defined as the single tree whose predictions agree most closely with those of the full ensemble.
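The notion of a representative tree can be illustrated with a small base R sketch. This is not the code dtGAP uses: the helper name `pick_representative_tree`, the toy prediction matrix, and the use of a plain majority vote as the ensemble prediction are all assumptions made for illustration (a cforest aggregates trees differently internally).

```r
# Hypothetical helper (not part of dtGAP): pick the tree whose predictions
# agree most often with the ensemble's predictions (classification case).
# tree_preds: character matrix, one row per observation, one column per tree.
pick_representative_tree <- function(tree_preds) {
  # Assumed ensemble prediction: majority vote across trees per observation.
  ensemble <- apply(tree_preds, 1, function(votes) {
    names(which.max(table(votes)))
  })
  # Agreement: share of observations where a tree matches the ensemble.
  # The comparison recycles `ensemble` down each column of the matrix.
  agreement <- colMeans(tree_preds == ensemble)
  list(index = which.max(agreement), agreement = agreement)
}

# Toy input: 4 observations, 3 trees (made-up labels).
tree_preds <- cbind(
  t1 = c("a", "a", "b", "b"),
  t2 = c("a", "b", "b", "a"),
  t3 = c("a", "a", "b", "a")
)

rep_tree <- pick_representative_tree(tree_preds)
rep_tree$index      # position of the tree with the highest agreement
rep_tree$agreement  # per-tree agreement with the majority vote
```

In this toy input the third tree matches the majority vote on every observation, so it is the one selected.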

Usage

rf_summary(
  x = NULL,
  target_lab = NULL,
  data_train = NULL,
  data_test = NULL,
  data_all = NULL,
  test_size = 0.3,
  task = c("classification", "regression"),
  ntree = 500L,
  mtry = NULL,
  rf_control = NULL,
  show_var_imp = TRUE,
  show_rep_tree = TRUE,
  top_n_vars = 15L,
  total_w = 297,
  total_h = 210
)
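The signature accepts either a ready-made `data_train`/`data_test` pair or a single `data_all` together with `test_size`. How dtGAP performs the split internally (seeding, stratification) is not documented here, so the following is only a minimal sketch of a simple random hold-out split in base R. The helper `split_train_test` and its `seed` argument are hypothetical.

```r
# Hypothetical helper (not dtGAP's own code): simple random hold-out split.
split_train_test <- function(data_all, test_size = 0.3, seed = 1L) {
  stopifnot(is.data.frame(data_all), test_size > 0, test_size < 1)
  set.seed(seed)
  # Number of rows to hold out, rounded to the nearest integer.
  n_test <- round(nrow(data_all) * test_size)
  stopifnot(n_test >= 1, n_test < nrow(data_all))
  test_idx <- sample.int(nrow(data_all), n_test)
  list(
    data_train = data_all[-test_idx, , drop = FALSE],
    data_test  = data_all[test_idx, , drop = FALSE]
  )
}

# iris (150 rows) stands in for a real dataset.
parts <- split_train_test(iris, test_size = 0.3)
nrow(parts$data_train)  # 105
nrow(parts$data_test)   # 45
```

Splitting by hand like this and passing the two pieces as `data_train` and `data_test` keeps the split reproducible regardless of what the function does by default.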

Value

A list, returned invisibly, with the following elements:

forest

The fitted cforest object.

var_imp

Named numeric vector of variable importance.

rep_tree_index

Index of the representative tree.
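Because `var_imp` is a named numeric vector, it can be ranked with ordinary base R tools. The sketch below uses made-up variable names and scores in place of a real result, and it assumes that larger values mean more important variables.

```r
# Made-up importance scores standing in for the `var_imp` element
# of the returned list; names and values are illustrative only.
var_imp <- c(Age = 0.042, Sex = 0.003, CRP = 0.081, Fever = 0.017)

# Keep the highest-scoring variables, analogous to `top_n_vars`.
top_n_vars <- 3L
top_vars <- head(sort(var_imp, decreasing = TRUE), top_n_vars)

names(top_vars)  # "CRP" "Age" "Fever"
```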

Arguments

x

Character. Dataset name/label. If NULL, inferred from data arguments.

target_lab

Character. Name of the target column.

data_train

Data frame. Training data.

data_test

Data frame. Test data.

data_all

Data frame. Full dataset.

test_size

Numeric. Proportion of observations assigned to the test split (default 0.3).

task

Character. "classification" or "regression".

ntree

Integer. Number of trees (default 500).

mtry

Integer or NULL. Number of candidate variables randomly sampled at each split.

rf_control

A ctree_control object or NULL.

show_var_imp

Logical. Show variable importance barplot (default TRUE).

show_rep_tree

Logical. Show representative tree info (default TRUE).

top_n_vars

Integer. How many top variables to show (default 15).

total_w

Numeric. Page width in mm (default 297).

total_h

Numeric. Page height in mm (default 210).
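The defaults of 297 by 210 mm correspond to an A4 page in landscape orientation. Base R graphics devices such as `pdf()` take their size in inches, so a conversion is needed when opening a device of matching size. The sketch below assumes that `rf_summary()` draws to the active graphics device, which this page does not state; the helper `mm_to_in` and the file name are hypothetical.

```r
# Hypothetical helper: convert millimetres to inches (1 in = 25.4 mm).
mm_to_in <- function(mm) mm / 25.4

total_w <- 297  # default page width in mm
total_h <- 210  # default page height in mm

page_in <- c(width = mm_to_in(total_w), height = mm_to_in(total_h))
round(page_in, 2)

# Assumed usage (left commented out; depends on how rf_summary() draws):
# pdf("rf_summary.pdf", width = page_in[["width"]], height = page_in[["height"]])
# rf_summary(..., total_w = total_w, total_h = total_h)
# dev.off()
```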

Examples

library(dtGAP)

# train_covid and test_covid are assumed to be example datasets
# available once dtGAP is attached.
rf_summary(
  data_train = train_covid,
  data_test = test_covid,
  target_lab = "Outcome",
  ntree = 50
)
