pwbart: Predicting new observations with a previously fitted BART model

Description

BART is a Bayesian approach to nonparametric function estimation and inference using a sum of trees. For a continuous response $y$ and a $p$-dimensional vector of predictors $x = (x_1, ..., x_p)'$, BART models the relationship between $y$ and $x$ as $$y = f(x) + \epsilon,$$ where $f$ is a sum of Bayesian regression trees and $\epsilon \sim N(0, \sigma^2)$. For a binary response $y$, probit BART models the relationship as $$P(Y=1|x)=\Phi[f(x)],$$ where $\Phi$ is the CDF of the standard normal distribution and $f$ is again a sum of Bayesian regression trees. The function pwbart() is inherited from the CRAN R package 'BART'.
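To make the two model forms concrete, the following base R sketch simulates a continuous and a binary response from them. The function f_true is a hypothetical stand-in for the sum-of-trees function $f$ and is not part of this package; the sample size, number of predictors, and sigma are likewise arbitrary choices for illustration.

```r
# Illustrative sketch only: f_true is a hypothetical stand-in for f(x).
set.seed(1)
n <- 200  # number of observations (arbitrary)
p <- 3    # number of predictors (arbitrary)
x <- matrix(runif(n * p), nrow = n, ncol = p)
f_true <- function(x) sin(pi * x[, 1]) + 2 * (x[, 2] > 0.5)

# Continuous response: y = f(x) + epsilon, with epsilon ~ N(0, sigma^2)
sigma <- 0.5
y_cont <- f_true(x) + rnorm(n, mean = 0, sd = sigma)

# Binary response: P(Y = 1 | x) = Phi[f(x)], where Phi is pnorm()
prob <- pnorm(f_true(x))
y_bin <- rbinom(n, size = 1, prob = prob)
```

In practice $f$ is unknown; wbart() or pbart() estimates it from data, and pwbart() evaluates the fitted trees at new predictor values.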

Usage

pwbart(
  x.test,
  treedraws,
  rm.const,
  mu = 0,
  mc.cores = 1L,
  transposed = FALSE,
  dodraws = TRUE,
  verbose = FALSE
)

Arguments

x.test

A matrix or a data frame of predictor values for prediction, with each row corresponding to an observation and each column corresponding to a predictor.

treedraws

The list $treedraws returned by the function wbart() or pbart().

rm.const

The vector $rm.const returned by the function wbart() or pbart().

mu

Mean to add to the predictions of y.

mc.cores

The number of threads to utilize.

transposed

A Boolean argument indicating whether the matrix x.test is transposed. When running pwbart() or mc.pwbart() in parallel, it is more memory-efficient to transpose x.test prior to calling the internal versions of these functions.

dodraws

A Boolean argument indicating whether to return the draws themselves (the default), or whether to return the mean of the draws as specified by dodraws=FALSE.

verbose

A Boolean argument indicating whether any messages are printed out.

Value

Returns the predictions for x.test. If dodraws=TRUE, returns a matrix of predictions with each row corresponding to a draw and each column corresponding to a new observation; if dodraws=FALSE, returns a vector of predictions, each of which is the mean of the draws for that observation.
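The relationship between the two return shapes can be sketched in base R. Here draws is a simulated stand-in for the matrix returned with dodraws=TRUE (it is not produced by pwbart()), and the dimensions are arbitrary assumptions. Column means of such a matrix should correspond to what dodraws=FALSE returns, and column quantiles give a simple pointwise interval.

```r
# Illustrative sketch only: `draws` mimics the shape of the dodraws=TRUE output.
set.seed(2)
ndpost <- 100  # number of posterior draws (rows)
n_test <- 5    # number of new observations (columns)
draws <- matrix(rnorm(ndpost * n_test, mean = 1), nrow = ndpost, ncol = n_test)

# Mean over draws for each new observation (the dodraws=FALSE summary)
post_mean <- colMeans(draws)

# Pointwise 95% interval for each new observation; result is a 2 x n_test matrix
post_ci <- apply(draws, 2, quantile, probs = c(0.025, 0.975))
```

Keeping the full matrix of draws is useful when interval estimates or other summaries are needed; the mean alone uses less memory when only point predictions are required.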

References

Chipman, H. A., George, E. I. and McCulloch, R. E. (2010). "BART: Bayesian additive regression trees." Ann. Appl. Stat. 4 266--298.

Linero, A. R. (2018). "Bayesian regression trees for high-dimensional prediction and variable selection." J. Amer. Statist. Assoc. 113 626--636.

Luo, C. and Daniels, M. J. (2021). "Variable Selection Using Bayesian Additive Regression Trees." arXiv preprint arXiv:2112.13998.

Rockova, V. and Saha, E. (2019). "On theory for BART." In The 22nd International Conference on Artificial Intelligence and Statistics 2839--2848. PMLR.

Sparapani, R., Spanbauer, C. and McCulloch, R. (2021). "Nonparametric machine learning and efficient computation with bayesian additive regression trees: the BART R package." J. Stat. Softw. 97 1--66.

Examples

# NOT RUN {
 
## simulate data (Scenario C.M.1. in Luo and Daniels (2021))
set.seed(123)
data = mixone(100, 10, 1, FALSE)
## run wbart() function
res = wbart(data$X, data$Y, ntree=10, nskip=100, ndpost=100)
## test pwbart() function
x.test = mixone(5, 10, 1, FALSE)$X
pred = pwbart(x.test, res$treedraws, res$rm.const, mu=mean(data$Y))
# }
