bartMachine (version 1.2.6)

rmse_by_num_trees: Assess the Out-of-sample RMSE by Number of Trees

Description

Assess out-of-sample RMSE of a BART model for varying numbers of trees in the sum-of-trees model.

Usage

rmse_by_num_trees(bart_machine, tree_list = c(5, seq(10, 50, 10), 100, 150, 200),
  in_sample = FALSE, plot = TRUE, holdout_pctg = 0.3, num_replicates = 4, ...)

Value

Invisibly returns the average out-of-sample RMSE for each number of trees in tree_list.

Arguments

bart_machine

An object of class "bartMachine".

tree_list

Vector of the numbers of trees to evaluate, one sum-of-trees model per entry.

in_sample

If TRUE, the RMSE is computed on in-sample data rather than an out-of-sample holdout.

plot

If TRUE, a plot of the RMSE by the number of trees in the ensemble is created.

holdout_pctg

Proportion of the data to be treated as an out-of-sample holdout (the default of 0.3 holds out 30%).

num_replicates

Number of replicates to average the results over. Each replicate uses a randomly sampled holdout of the data, so holdouts from different replicates may overlap.

...

Other arguments to be passed to the plot function.
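
The arguments above describe a repeated random-holdout procedure. The following is a minimal base R sketch of that idea, not the package's implementation: lm() with a polynomial degree stands in for BART with a given number of trees, and the names degree_list, rmses, and avg_rmse are hypothetical, introduced only for illustration.

```r
# Stand-in illustration only: polynomial degree plays the role of num_trees.
# Nothing here calls bartMachine.
set.seed(11)
n = 200
x = runif(n)
y = sin(2 * pi * x) + rnorm(n, sd = 0.3)
dat = data.frame(x = x, y = y)

degree_list = c(1, 3, 5)   # analogue of tree_list (hypothetical name)
holdout_pctg = 0.3         # proportion of rows held out in each replicate
num_replicates = 4
holdout_size = round(holdout_pctg * n)

# one row per replicate, one column per complexity setting
rmses = matrix(NA, nrow = num_replicates, ncol = length(degree_list))
for (j in seq_along(degree_list)) {
  d = degree_list[j]
  for (r in seq_len(num_replicates)) {
    # a fresh random holdout is drawn every time, so holdouts may overlap
    holdout_idx = sample(seq_len(n), holdout_size)
    train = dat[-holdout_idx, ]
    test = dat[holdout_idx, ]
    fit = lm(y ~ poly(x, d), data = train)
    pred = predict(fit, newdata = test)
    rmses[r, j] = sqrt(mean((test$y - pred)^2))
  }
}

# average over replicates: one out-of-sample RMSE per complexity setting
avg_rmse = colMeans(rmses)
names(avg_rmse) = degree_list
```

Increasing num_replicates reduces the noise in each averaged RMSE, at the cost of refitting the model once more per setting for every extra replicate.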

Author

Adam Kapelner and Justin Bleich

Examples

if (FALSE) {
library(bartMachine)

# generate Friedman data
set.seed(11)
n = 200
p = 10
X = data.frame(matrix(runif(n * p), ncol = p))
y = 10 * sin(pi * X[, 1] * X[, 2]) + 20 * (X[, 3] - 0.5)^2 + 10 * X[, 4] + 5 * X[, 5] + rnorm(n)

# build BART regression model
bart_machine = bartMachine(X, y, num_trees = 20)

# explore RMSE by number of trees
rmse_by_num_trees(bart_machine)
}