The external BART functions (e.g., wbart()) operate on matrices in memory. Therefore, if the user submits a vector or data frame, this function converts it to a matrix. It also determines the number of cut points necessary for each column when asked to do so. This function is inherited from the CRAN package 'BART'.
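For instance, a factor column in a data frame is expanded into one dummy column per level during the conversion. A minimal sketch (the toy data frame df below is hypothetical, not part of the package):

## hypothetical toy input: one numeric and one factor column
df <- data.frame(x1 = c(0.5, 1.2, -0.3, 2.1),
                 x2 = factor(c("a", "b", "a", "b")))
## with the default numcut = 0, a plain matrix is returned;
## the factor x2 is expanded into one dummy column per level
m <- bartModelMatrix(df)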
bartModelMatrix(
X,
numcut = 0L,
usequants = FALSE,
type = 7,
rm.const = FALSE,
cont = FALSE,
xinfo = NULL
)
X: A vector or data frame to be converted into a matrix.

numcut: The maximum number of cut points to consider. If numcut=0, then a matrix is returned; otherwise, a list is returned containing a matrix X, a vector numcut and a list xinfo.

usequants: A Boolean argument indicating how to generate cut points. If usequants=FALSE, then the cut points in xinfo are generated uniformly; otherwise, quantiles are used for the cut points.

type: An integer between \(1\) and \(9\) determining which algorithm is employed by the function quantile().

rm.const: A Boolean argument indicating whether to remove constant variables.

cont: A Boolean argument indicating whether to assume all variables are continuous.

xinfo: A list (matrix) where the items (rows) are the predictors and the contents (columns) of the items are the cut points. If xinfo=NULL, BART will choose xinfo for the user.
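To illustrate the two return modes controlled by numcut, a brief sketch (the input matrix x below is hypothetical):

## hypothetical input matrix
set.seed(1)
x <- matrix(rnorm(50 * 3), ncol = 3)
## numcut = 0: a plain matrix is returned
m <- bartModelMatrix(x, numcut = 0)
## numcut > 0: a list is returned; with usequants = TRUE the cut points
## are placed at quantiles rather than uniformly over each column's range
res <- bartModelMatrix(x, numcut = 20, usequants = TRUE)
names(res)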
The function bartModelMatrix() returns a list with the following components.
X: A matrix with rows corresponding to observations and columns corresponding to predictors (after dummification).

numcut: A vector of ncol(X) integers, each indicating the number of cut points for the corresponding predictor.

rm.const: A vector of indicators for the predictors (after dummification) used in BART; a negative indicator means that the corresponding predictor was removed.

xinfo: A list (matrix) where the items (rows) are the predictors and the contents (columns) of the items are the cut points.

grp: A vector of group indices for predictors. For example, if \(2\) appears \(3\) times in grp, the second predictor of X is a categorical predictor with \(3\) levels.
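A small illustrative sketch of these components (the toy data frame below is hypothetical): one numeric predictor plus a 3-level factor.

set.seed(2)
df <- data.frame(num = rnorm(30),
                 cat = factor(sample(c("a", "b", "c"), 30, replace = TRUE)))
res <- bartModelMatrix(df, numcut = 10, rm.const = TRUE)
dim(res$X)   # 30 observations, 4 columns after dummification (num + 3 dummies)
res$numcut   # one integer per column of res$X
res$grp      # 2 repeated 3 times: the second input predictor has 3 levels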
Chipman, H. A., George, E. I. and McCulloch, R. E. (2010). "BART: Bayesian additive regression trees." Ann. Appl. Stat. 4 266--298.
Linero, A. R. (2018). "Bayesian regression trees for high-dimensional prediction and variable selection." J. Amer. Statist. Assoc. 113 626--636.
Luo, C. and Daniels, M. J. (2021). "Variable Selection Using Bayesian Additive Regression Trees." arXiv preprint arXiv:2112.13998.
Rockova, V. and Saha, E. (2019). "On theory for BART." In The 22nd International Conference on Artificial Intelligence and Statistics, 2839--2848. PMLR.
Sparapani, R., Spanbauer, C. and McCulloch, R. (2021). "Nonparametric machine learning and efficient computation with Bayesian additive regression trees: the BART R package." J. Stat. Softw. 97 1--66.
## simulate data (Scenario C.M.1. in Luo and Daniels (2021))
set.seed(123)
data <- mixone(100, 10, 1, FALSE)
## test the bartModelMatrix() function
res <- bartModelMatrix(data$X, numcut = 100, usequants = FALSE, cont = FALSE, rm.const = TRUE)
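A possible follow-up, assuming the example above has been run, is to inspect the structure of the returned list:

str(res, max.level = 1)   # components such as X, numcut, rm.const, xinfo and grp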