bartXViz (version 1.0.8)

barps: Bayesian Additive Regression Trees with Post-Stratification (BARP)

Description

This function uses Bayesian Additive Regression Trees (BART) to extrapolate survey data to a level of geographic aggregation for which the original survey was not designed to be representative. It is a modified version of the barp function from the BARP package (https://github.com/jbisbee1/BARP) that allows the random seed to be fixed.
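The post-stratification step can be illustrated in base R. This is a minimal sketch of the general idea, not the internals of barps: the data frame `cells`, its columns, and the numbers in it are hypothetical, and in practice the cell-level predictions would come from the fitted BART model rather than being typed in by hand.

```r
# Hypothetical cell-level predictions and census counts (illustrative only).
cells <- data.frame(
  state = c("A", "A", "B", "B"),
  age   = factor(c("young", "old", "young", "old")),
  pred  = c(0.60, 0.40, 0.70, 0.30),  # predicted outcome for each cell
  n     = c(300, 100, 50, 150)        # census count for each cell
)

# Post-stratify: population-weighted mean of the cell predictions
# within each geographic unit.
poststrat <- sapply(
  split(cells, cells$state),
  function(d) weighted.mean(d$pred, d$n)
)
print(poststrat)
```

Each unit's estimate is pulled toward the cells that make up more of its population, which is why the census weights matter as much as the predictions.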

Usage

barps(
  y,
  x,
  dat,
  census,
  geo.unit,
  algorithm = "BARP",
  setSeed = NULL,
  proportion = "None",
  cred_int = c(0.025, 0.975),
  BSSD = FALSE,
  nsims = 200,
  ...
)

Value

Returns an object of class BARP, containing a list of the following components:

pred.opn

A data.frame in which each row corresponds to one geographic unit of interest and the columns give the predicted outcome and the lower and upper bounds of the requested credible interval (cred_int).

trees

A bartMachine object.

risk

A data.frame containing the cross-validation risk for each algorithm and the associated weight used in the ensemble predictions. Only useful when multiple algorithms are used.

barp.dat

Data containing the estimates and credible intervals for each observation in the input census dataset.

setSeed

The random seed value employed during model estimation using bartMachine.

proportion

The number of observations in each combination of features.

x

The names of the explanatory variables included in the model.

Arguments

y

Outcome of interest. Should be a character string giving the name of the column that contains the outcome variable.

x

Prognostic covariates. Should be a vector of column names corresponding to the covariates used to predict the outcome variable of interest.

dat

Survey data containing the x and y columns. The explanatory variables named in x must be converted to factors before being passed in.
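A minimal sketch of the factor conversion in base R. The data frame and the column names (`supp`, `age`, `educ`) are hypothetical and serve only to show the pattern.

```r
# Hypothetical survey data: one outcome column and two covariates.
dat <- data.frame(
  supp = c(1, 0, 1, 0),
  age  = c("18-29", "30-44", "18-29", "45+"),
  educ = c(1, 2, 2, 3),
  stringsAsFactors = FALSE
)

# Names of the explanatory variables, as they would be passed in x.
x <- c("age", "educ")

# Convert only the covariate columns to factors; the outcome is left as is.
dat[x] <- lapply(dat[x], factor)
str(dat)
```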

census

Census data containing the x columns. These columns must have the same structure as the x columns in dat. If the user provides raw census data, BARP calculates the proportion for each unique bin of x covariates. Otherwise, the researcher must calculate the bin proportions beforehand and give the name of the column that contains them, either as percentages or as raw counts.

geo.unit

The column name corresponding to the unit at which outcomes should be aggregated.

algorithm

Algorithm for predicting opinions. Can be any algorithm(s) included in the SuperLearner package. If multiple algorithms are listed, predicted opinions are provided for each separately, as well as for the weighted ensemble. Defaults to "BARP", which implements Bayesian Additive Regression Trees via bartMachine.

setSeed

Seed to control random number generation.

proportion

The column name corresponding to the proportions for covariate bins in the census data. If left at the default value "None", BARP assumes raw census data and estimates the bin proportions automatically.
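For readers who prefer to compute the bin proportions themselves, the following base R sketch shows one way to do it. It assumes that proportions are taken within each geographic unit, which is an assumption about the intended weighting rather than something stated here; the data frame and the columns `state`, `age`, `n`, and `prop` are hypothetical.

```r
# Hypothetical raw census data: one row per person.
census <- data.frame(
  state = c("A", "A", "A", "B", "B", "B", "B"),
  age   = c("young", "young", "old", "young", "old", "old", "old")
)

# Count the rows in each unique (state, age) bin.
counts <- aggregate(n ~ state + age, data = transform(census, n = 1), FUN = sum)

# Convert the counts to within-state proportions.
counts$prop <- counts$n / ave(counts$n, counts$state, FUN = sum)
print(counts)
```

The resulting column name (here `prop`) is what would then be supplied as the proportion argument.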

cred_int

A vector giving the lower and upper bounds on the credible interval for the predictions.
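As a rough illustration of what the two values in cred_int mean, the sketch below takes quantiles of a set of simulated draws. The draws are hypothetical stand-ins for a posterior sample, and whether barps computes its bounds in exactly this way is an assumption.

```r
# Hypothetical posterior draws for one geographic unit.
set.seed(1)
draws <- rnorm(1000, mean = 0.55, sd = 0.05)

# Lower and upper probabilities, matching the default cred_int.
cred_int <- c(0.025, 0.975)

# The credible interval is the pair of quantiles at those probabilities.
bounds <- quantile(draws, probs = cred_int)
print(bounds)
```

With the default c(0.025, 0.975), the interval covers the central 95% of the draws.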

BSSD

Whether to calculate a bootstrapped standard deviation. Defaults to FALSE, in which case BART's default standard deviation is used.

nsims

The number of bootstrap simulations.

...

Additional arguments to be passed to bartMachine or SuperLearner.

See Also

barps is used to implement Bayesian Additive Regression Trees based on the bartMachine package. For detailed options, see https://CRAN.R-project.org/package=bartMachine.

barps also uses the SuperLearner package to implement alternative regularizers. For more details, see https://CRAN.R-project.org/package=SuperLearner.