trainsetBiasTree: Trainset Bias

Description

For a tree in the forest, the trainset bias is the prediction of its root node. Equivalently, it is the unconditional prediction of the tree, that is, the average response of the samples used to train the tree.

Usage

trainsetBiasTree(tidy.RF, tree)
trainsetBias(tidy.RF)

Arguments

tidy.RF

A tidy random forest. The random forest from which to extract the trainset bias.

tree

An integer. The index of the tree to look at.

Value

A matrix. The content depends on the type of the response.

Regression: A 1-by-1 matrix. The trainset bias for the prediction of the response.
Classification: A 1-by-D matrix, where D is the number of response classes. Each column holds the trainset bias for the prediction of one response class.

Functions

trainsetBiasTree: Trainset bias within a single tree
trainsetBias: Trainset bias within the whole forest

Details

For a forest, the trainset bias is simply the average trainset bias across all trees. This is because the prediction of a forest is the average of the predictions of its trees.
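The averaging described above can be illustrated without the package. The following is a minimal base R sketch, not the package's implementation: it assumes each tree is trained on a bootstrap sample and treats a tree's trainset bias as the class proportions of that sample. The helper tree_bias and all variable names are hypothetical.

```r
# Illustrative only: mimic per-tree trainset bias with bootstrap samples of iris.
set.seed(42)
n_trees <- 5
y <- iris$Species

# Hypothetical helper: class proportions of one bootstrap sample,
# returned as a 1-by-D matrix (D = number of response classes).
tree_bias <- function(y) {
  inbag <- sample(length(y), replace = TRUE)
  props <- table(y[inbag]) / length(inbag)
  matrix(as.numeric(props), nrow = 1, dimnames = list(NULL, names(props)))
}

per_tree <- lapply(seq_len(n_trees), function(i) tree_bias(y))

# Forest-level bias: element-wise average of the per-tree 1-by-D matrices.
forest_bias <- Reduce(`+`, per_tree) / n_trees
forest_bias
```

Because every per-tree row sums to one, the averaged row does too, so the forest-level result is still a valid set of class proportions.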

Together with featureContrib and featureContribTree, the trainset bias decomposes a prediction into per-feature contributions:

$\mathrm{prediction}(\mathrm{MODEL}, X) = \mathrm{trainsetBias}(\mathrm{MODEL}) + \mathrm{featureContrib}_{1}(\mathrm{MODEL}, X) + \dots + \mathrm{featureContrib}_{p}(\mathrm{MODEL}, X),$

where MODEL can be either a tree or a forest, and p is the number of features.
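To make the decomposition concrete, here is a minimal base R sketch using a hand-built one-split regression tree on mtcars. It does not call the package; the split point, the variable names, and the way the contribution is computed (change in node prediction from root to leaf) are assumptions chosen for illustration.

```r
# Illustrative only: a hand-built one-split regression tree (stump) on mtcars.
x <- mtcars$wt
y <- mtcars$mpg
threshold <- 3  # hypothetical split point on the single feature `wt`

# Root prediction = mean response of the training samples = trainset bias.
bias <- mean(y)

# Leaf predictions are the mean responses on each side of the split.
left_mean  <- mean(y[x <  threshold])
right_mean <- mean(y[x >= threshold])

# Contribution of `wt` for a new observation: the change in node prediction
# when moving from the root to the leaf the observation falls into.
new_wt <- 2.5
leaf_pred <- if (new_wt < threshold) left_mean else right_mean
contrib_wt <- leaf_pred - bias

# Check the decomposition: prediction = trainset bias + feature contribution.
all.equal(leaf_pred, bias + contrib_wt)
```

With more features and deeper trees, each split along the decision path would add its change in node prediction to the contribution of the feature it splits on, and the same sum would hold.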

References

Interpreting random forests http://blog.datadive.net/interpreting-random-forests/

Random forest interpretation with scikit-learn http://blog.datadive.net/random-forest-interpretation-with-scikit-learn/

Examples

# NOT RUN {
library(ranger)
library(tree.interpreter)  # assumed to be the package providing tidyRF and trainsetBias
rfobj <- ranger(Species ~ ., iris, keep.inbag=TRUE)
tidy.RF <- tidyRF(rfobj, iris[, -5], iris[, 5])
trainsetBiasTree(tidy.RF, 1)
trainsetBias(tidy.RF)

# }
