bartModelMatrix: Create a matrix out of a vector or data frame

Description

The external BART functions (e.g. wbart()) operate on matrices in memory. Therefore, if the user submits a vector or data frame, then this function converts it to a matrix. Also, it determines the number of cut points necessary for each column when asked to do so. This function is inherited from the CRAN package 'BART'.

Usage

bartModelMatrix(
  X,
  numcut = 0L,
  usequants = FALSE,
  type = 7,
  rm.const = FALSE,
  cont = FALSE,
  xinfo = NULL
)

Arguments

X

A vector or data frame from which the matrix is created.

numcut

The maximum number of cut points to consider. If numcut=0, then return a matrix; otherwise, return a list containing a matrix X, a vector numcut and a list xinfo.

usequants

A Boolean argument indicating the way to generate cut points. If usequants=FALSE, then the cut points in xinfo are generated uniformly; otherwise, the quantiles are used for the cut points.

type

An integer between \(1\) and \(9\) determining which algorithm is employed in the function quantile().

rm.const

A Boolean argument indicating whether to remove constant variables.

cont

A Boolean argument indicating whether to assume all variables are continuous.

xinfo

A list (matrix) where the items (rows) are the predictors and the contents (columns) of the items are the cut points. If xinfo=NULL, BART will choose xinfo for the user.
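The two cut-point schemes selected by usequants can be illustrated in base R. The following is only a sketch under stated assumptions, not the package's actual implementation: it assumes that "uniform" means an equally spaced grid strictly between the minimum and maximum of a predictor, and that the quantile scheme uses equally spaced probabilities. The names x, uniform_cuts, quantile_cuts and xinfo_sketch are hypothetical.

```r
## Illustrative sketch only; bartModelMatrix() may differ in detail.
set.seed(1)
x <- rexp(200)   # a skewed, continuous predictor (hypothetical data)
numcut <- 5L     # number of cut points requested

## Uniform scheme (assumed): numcut interior points of an equally
## spaced grid between min(x) and max(x), endpoints dropped
uniform_cuts <- seq(min(x), max(x), length.out = numcut + 2L)[-c(1L, numcut + 2L)]

## Quantile scheme (assumed): sample quantiles at equally spaced
## probabilities, computed with the algorithm chosen by 'type'
probs <- (1:numcut) / (numcut + 1)
quantile_cuts <- unname(quantile(x, probs = probs, type = 7))

## An xinfo-like matrix: one row per predictor, one column per cut point
xinfo_sketch <- rbind(uniform = uniform_cuts, quantile = quantile_cuts)
print(round(xinfo_sketch, 3))
```

For a skewed predictor such as this one, the quantile cut points concentrate where the data are dense, whereas the uniform grid spends cut points on the sparse tail.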

Value

When numcut is greater than 0, the function bartModelMatrix() returns a list with the following components; when numcut=0, it returns only the matrix X.

X

A matrix with rows corresponding to observations and columns corresponding to predictors (after dummification).

numcut

A vector of ncol(X) integers with each indicating the number of cut points for the corresponding predictor.

rm.const

A vector of indicators for the predictors (after dummification) used in BART; a negative indicator means that the corresponding predictor is to be removed.

xinfo

A list (matrix) where the items (rows) are the predictors and the contents (columns) of the items are the cut points.

grp

A vector of group indices for predictors. For example, if \(2\) appears \(3\) times in grp, the second predictor of X is a categorical predictor with \(3\) levels.
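The relationship between dummification and grp can be sketched in base R. This is an illustration under assumptions, not the package's code: it assumes that every level of a categorical predictor gets its own 0/1 column and that grp repeats the index of the original predictor once per generated column. The data frame df and the objects dummies, X_sketch and grp_sketch are hypothetical.

```r
## Illustrative sketch only; bartModelMatrix() may differ in detail.
df <- data.frame(
  x1 = c(0.5, 1.2, 3.1, 2.2),              # continuous predictor
  x2 = factor(c("a", "b", "c", "a"))       # categorical predictor, 3 levels
)

## One indicator column per level of x2
dummies <- sapply(levels(df$x2), function(l) as.integer(df$x2 == l))

## Matrix after dummification: 1 column for x1, 3 columns for x2
X_sketch <- cbind(x1 = df$x1, dummies)

## Group index of the original predictor behind each column
grp_sketch <- c(1L, rep(2L, nlevels(df$x2)))

print(X_sketch)
print(grp_sketch)
```

Here 2 appears 3 times in grp_sketch, matching the description above: the second predictor is categorical with 3 levels.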

References

Chipman, H. A., George, E. I. and McCulloch, R. E. (2010). "BART: Bayesian additive regression trees." Ann. Appl. Stat. 4 266--298.

Linero, A. R. (2018). "Bayesian regression trees for high-dimensional prediction and variable selection." J. Amer. Statist. Assoc. 113 626--636.

Luo, C. and Daniels, M. J. (2021). "Variable Selection Using Bayesian Additive Regression Trees." arXiv preprint arXiv:2112.13998.

Rockova, V. and Saha, E. (2019). "On theory for BART." In The 22nd International Conference on Artificial Intelligence and Statistics (pp. 2839--2848). PMLR.

Sparapani, R., Spanbauer, C. and McCulloch, R. (2021). "Nonparametric machine learning and efficient computation with bayesian additive regression trees: the BART R package." J. Stat. Softw. 97 1--66.

Examples

# NOT RUN {
 
## simulate data (Scenario C.M.1. in Luo and Daniels (2021))
set.seed(123)
data = mixone(100, 10, 1, FALSE)
## test bartModelMatrix() function
res = bartModelMatrix(data$X, numcut=100, usequants=FALSE, cont=FALSE, rm.const=TRUE)
# }