
For a tree in the forest, trainset bias is the prediction of its root node, or the unconditional prediction of the tree, or the average response of the samples used to train the tree.
trainsetBiasTree(tidy.RF, tree)
trainsetBias(tidy.RF)
tidy.RF: A tidy random forest. The random forest to extract trainset bias from.
tree: An integer. The index of the tree to look at.
A matrix. The content depends on the type of the response.
Regression: A 1-by-1 matrix. The trainset bias for the prediction of the response.
Classification: A 1-by-D matrix, where D is the number of response classes. Each column of the matrix stands for the trainset bias for the prediction of each response class.
trainsetBiasTree: Trainset bias within a single tree
trainsetBias: Trainset bias within the whole forest
For a forest, the trainset bias is simply the average trainset bias across all trees. This is because the prediction of a forest is the average of the predictions of its trees.
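The definitions above can be illustrated by hand in base R. This is only a toy sketch, not the package's implementation: the bootstrap sampling, the choice of `mtcars$mpg` and `iris$Species` as responses, and the forest size `n.trees` are all assumptions made for illustration.

```r
# Toy illustration in base R; this does not call the package.
set.seed(42)
y.reg <- mtcars$mpg      # a numeric response (regression)
y.cls <- iris$Species    # a factor response with D = 3 classes
n.trees <- 5             # hypothetical forest size

# Regression: a tree's trainset bias is the mean response over the
# samples used to train it (here, a bootstrap sample).
bias.reg <- sapply(seq_len(n.trees), function(t) {
  inbag <- sample(length(y.reg), replace = TRUE)
  mean(y.reg[inbag])
})
# The forest's trainset bias is the average over trees (1-by-1).
forest.bias.reg <- matrix(mean(bias.reg), nrow = 1)

# Classification: a tree's trainset bias is the vector of class
# proportions among its training samples (one column per tree).
bias.cls <- sapply(seq_len(n.trees), function(t) {
  inbag <- sample(length(y.cls), replace = TRUE)
  as.vector(prop.table(table(y.cls[inbag])))
})
# Averaging across trees gives a 1-by-D matrix.
forest.bias.cls <- matrix(rowMeans(bias.cls), nrow = 1,
                          dimnames = list(NULL, levels(y.cls)))

forest.bias.reg
forest.bias.cls
```

In the classification case the entries of the averaged matrix sum to 1, since each tree contributes a vector of class proportions.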
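Together with featureContrib(Tree), trainsetBias(Tree) can decompose the prediction by feature importance:
prediction(MODEL, X) = trainsetBias(MODEL) + featureContrib_1(MODEL, X) + ... + featureContrib_p(MODEL, X),
where MODEL can be either a tree or a forest, and p is the number of features.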
Interpreting random forests http://blog.datadive.net/interpreting-random-forests/
Random forest interpretation with scikit-learn http://blog.datadive.net/random-forest-interpretation-with-scikit-learn/
# NOT RUN {
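library(ranger)
# Fit a forest; keep.inbag = TRUE records which samples trained each tree
rfobj <- ranger(Species ~ ., iris, keep.inbag = TRUE)
# Convert to a tidy random forest using the training features and response
tidy.RF <- tidyRF(rfobj, iris[, -5], iris[, 5])
# Trainset bias of the first tree, then of the whole forest
trainsetBiasTree(tidy.RF, 1)
trainsetBias(tidy.RF)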
# }
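As a further sketch, the relationship between the two functions can be checked numerically by averaging the per-tree results and comparing them with the forest-level result. Two assumptions here are not stated on this page: that the functions are provided by the `tree.interpreter` package, and that the number of trees can be read from the `num.trees` element of the fitted ranger object.

```r
library(ranger)
library(tree.interpreter)  # assumed package name for tidyRF() and trainsetBias()

rfobj <- ranger(Species ~ ., iris, keep.inbag = TRUE)
tidy.RF <- tidyRF(rfobj, iris[, -5], iris[, 5])

# Trainset bias of every tree, each a 1-by-D matrix
n.trees <- rfobj$num.trees
per.tree <- lapply(seq_len(n.trees), function(t) trainsetBiasTree(tidy.RF, t))

# Element-wise average across trees
avg.bias <- Reduce(`+`, per.tree) / n.trees

# Compare with the forest-level trainset bias (up to numerical tolerance)
same <- all.equal(as.vector(avg.bias), as.vector(trainsetBias(tidy.RF)))
same
```

Comparing with `all.equal()` rather than `==` allows for small floating-point differences between the two ways of computing the average.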