BartMixVs (version 1.0.0)

mixtwo: Generate data with correlated and mixed-type predictors

Description

Generate a data set of responses and predictors in which the predictors are correlated and of mixed types (binary and continuous).

Usage

mixtwo(n, sigma, binary)

Arguments

n

The number of observations.

sigma

The error variance.

binary

A boolean argument: binary = TRUE indicates that binary responses are generated and binary = FALSE indicates that continuous responses are generated.

Value

Returns a list with the following components.

X

An n by p data frame of predictor values, with each row corresponding to an observation.

Y

A vector of length n representing response values.

f0

A vector of length n representing the values of \(f0(x)\).

sigma

The error variance, returned only when binary = FALSE.

prob

A vector of length n representing the values of \(\Phi(f0(x))\), which is only returned when binary = TRUE.

Details

The predictors are sampled as follows: \(x_1, \ldots, x_{20}\) independently from Bernoulli(0.2), \(x_{21}, \ldots, x_{40}\) independently from Bernoulli(0.5), and \(x_{41}, \ldots, x_{84}\) from a multivariate normal distribution with mean 0, variance 1 and correlation 0.3. If binary = FALSE, the continuous response \(y\) is sampled from Normal(\(f0(x), \sigma^2\)), where $$f0(x) = -4 + x_1 + \sin(\pi x_1 x_{44}) - x_{21} + 0.6 x_{41} x_{42} - \exp[-2(x_{42}+1)^2] - x_{43}^2 + 0.5 x_{44}.$$ If binary = TRUE, the binary response \(y\) is sampled from Bernoulli(\(\Phi(f0(x))\)), where \(f0\) is defined above and \(\Phi\) is the cumulative distribution function of the standard normal distribution.
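
The sampling scheme described above can be sketched in base R. This is an illustration only, not the package's implementation: the function name simulate_mixtwo_sketch is hypothetical, the 44 continuous predictors are assumed to share a common pairwise correlation of 0.3 (the page does not state the correlation structure), and sigma is assumed to enter as the standard deviation so that the response variance is \(\sigma^2\), as in the formula above.

```r
# Hypothetical helper: a sketch of the data-generating process in Details,
# NOT the BartMixVs source code.
simulate_mixtwo_sketch <- function(n, sigma, binary) {
  # x_1..x_20 ~ Bernoulli(0.2) and x_21..x_40 ~ Bernoulli(0.5), independent
  x_bin1 <- matrix(rbinom(n * 20, 1, 0.2), n, 20)
  x_bin2 <- matrix(rbinom(n * 20, 1, 0.5), n, 20)

  # x_41..x_84 ~ multivariate normal, mean 0, variance 1.
  # Assumption: every pair of these predictors has correlation 0.3.
  Sigma <- matrix(0.3, 44, 44)
  diag(Sigma) <- 1
  # chol() returns upper-triangular R with t(R) %*% R = Sigma,
  # so the rows of Z %*% R have covariance Sigma.
  x_cont <- matrix(rnorm(n * 44), n, 44) %*% chol(Sigma)

  X <- as.data.frame(cbind(x_bin1, x_bin2, x_cont))
  names(X) <- paste0("x", 1:84)

  # Mean function f0(x) from Details
  f0 <- with(X, -4 + x1 + sin(pi * x1 * x44) - x21 + 0.6 * x41 * x42 -
               exp(-2 * (x42 + 1)^2) - x43^2 + 0.5 * x44)

  if (binary) {
    prob <- pnorm(f0)                 # Phi(f0(x))
    Y <- rbinom(n, 1, prob)
    list(X = X, Y = Y, f0 = f0, prob = prob)
  } else {
    # Assumption: sigma is used as the standard deviation
    Y <- rnorm(n, mean = f0, sd = sigma)
    list(X = X, Y = Y, f0 = f0, sigma = sigma)
  }
}

set.seed(1)
sim <- simulate_mixtwo_sketch(n = 100, sigma = 1, binary = FALSE)
dim(sim$X)
```

With sigma = 1 the variance and standard deviation readings coincide, so the ambiguity only matters for other values of sigma.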

References

Luo, C. and Daniels, M. J. (2021) "Variable Selection Using Bayesian Additive Regression Trees." arXiv preprint arXiv:2112.13998.

Examples

# NOT RUN {
# Generate 100 observations with continuous responses (binary = FALSE)
data = mixtwo(n = 100, sigma = 1, binary = FALSE)
# }