RCTREP: Replicate treatment effect estimates obtained from a randomized control trial using observational data

Description

The function RCTREP validates estimates of treatment effects obtained from observational data (real-world data, RWD) by comparing them with estimates from a target randomized controlled trial (RCT). The function currently implements three types of estimators of treatment effects: G_computation, inverse propensity score weighting (IPW), and augmented propensity score weighting. It also implements three types of weighting estimators for comparing the resulting estimates of treatment effects from RWD with the target RCT: exact matching weights, inverse selection probability weighting, and sub-classification. Because the sample in the RCT is regarded as the target population, the weight for each individual in the observational data is \(p/(1-p)\), so that the weighted population of the observational data is representative of the target population.
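
The odds weight \(p/(1-p)\) mentioned above can be illustrated with a small base R sketch. This is not the RCTrep implementation: it assumes that p denotes the estimated probability that an individual with given covariates belongs to the target (RCT) sample, fitted here with a plain logistic regression on simulated data. The names source_df, target_df, and in_target are hypothetical.

```r
# Illustrative sketch in base R; not the RCTrep implementation.
set.seed(1)

# Simulated covariate: the source (RWD) sample is shifted relative to the target (RCT).
n_source <- 2000
n_target <- 1000
source_df <- data.frame(x = rnorm(n_source, mean = 0), in_target = 0)
target_df <- data.frame(x = rnorm(n_target, mean = 1), in_target = 1)
stacked <- rbind(source_df, target_df)

# Selection model (assumption): probability of belonging to the target sample given x.
selection_fit <- glm(in_target ~ x, family = binomial(), data = stacked)
p <- predict(selection_fit, newdata = source_df, type = "response")

# Odds weights p / (1 - p) for the individuals in the source sample.
w <- p / (1 - p)

# Compare the covariate mean before and after weighting with the target mean.
unweighted_mean <- mean(source_df$x)
weighted_mean <- weighted.mean(source_df$x, w)
target_mean <- mean(target_df$x)
```

In this simulated setting, weighting should pull the source mean of x toward the target mean, which is the sense in which the weighted observational population becomes representative of the target.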

Usage

RCTREP(
  TEstimator = "G_computation",
  SEstimator = "Exact",
  source.data = source.data,
  target.data = target.data,
  source.name = "RWD",
  target.name = "RCT",
  vars_name,
  selection_predictors,
  outcome_method = "glm",
  treatment_method = "glm",
  weighting_method = "glm",
  outcome_formula = NULL,
  treatment_formula = NULL,
  selection_formula = NULL,
  stratification = NULL,
  stratification_joint = FALSE,
  strata_cut_source = NULL,
  strata_cut_target = NULL,
  two_models = FALSE,
  data.public = TRUE,
  ...
)

Value

A list of length three containing three R6 class objects: source.obj, target.obj, and source.rep.obj.

Arguments

TEstimator: A character specifying an estimator for conditional average treatment effects. The allowed estimators for TEstimator are: "G_computation", "IPW", and "DR". The corresponding object will be created by the wrapper function TEstimator_wrapper(). The default is "G_computation", which, together with outcome_method = "glm", models the potential outcomes.
SEstimator: A character specifying an estimator for the weights. The allowed estimators are: "Exact", "Subclass", and "ISW". The default is "Exact", which implements exact matching on the variables in selection_predictors to balance the population covariates between source.data and target.data.
source.data: A data frame containing the variables named in vars_name and possibly other variables. source.obj is instantiated using source.data.
target.data: A data frame containing the variables named in vars_name and possibly other variables. target.obj is instantiated using target.data.
source.name: A character indicating the name of source.obj.
target.name: A character indicating the name of target.obj.
vars_name: A list containing the vectors outcome_predictors, treatment_name, and outcome_name. outcome_predictors is a character vector containing the adjustment variables, which are used, along with TEstimator and the corresponding outcome_method or treatment_method, to correct for confounding; outcome_name is a character vector of length one containing the variable name of the outcome; treatment_name is a character vector of length one containing the variable name of the treatment.
selection_predictors: A character vector specifying variable names. The weights are estimated based on these variables.
outcome_method, treatment_method, weighting_method: A character specifying the model to use for the outcome, the treatment, and the weights, respectively. Possible values are found using names(getModelInfo()). See http://topepo.github.io/caret/train-models-by-tag.html.
outcome_formula, treatment_formula, selection_formula: An optional object of class formula describing the outcome model specification, treatment model specification, and selection model specification.
stratification: An optional character vector containing variables used to select subgroups. source.obj will compute both weighted and unweighted average treatment effects of the subgroups; target.obj will calculate the average treatment effects of the subgroups.
stratification_joint: An optional logical indicating whether the subgroups are selected based on the levels of the combined variables in stratification or on the levels of each individual variable in stratification.
strata_cut_source: An optional list of lists. Each component is a list, tagged with the name of a variable in source.data to discretize, containing break, a vector specifying the interval boundaries used to divide the range of the variable, and label, a character vector specifying how values of the variable are coded according to the interval in which they fall. The leftmost interval corresponds to level one, the next leftmost to level two, and so on. This parameter is useful when the quantity of interest is the integrated treatment effect conditional on variables with many levels (for instance, a continuous variable or an ordinal variable with many levels). Note that the models are first fitted on the original continuous variables, and the variables are then discretized according to strata_cut. The variables in the data of the TEstimator object are discretized, and the weights are calculated based on the discretized variables.
strata_cut_target: An optional list containing lists. Each component is a list with tag named by a variable in target.data to discretize.
two_models: An optional logical indicating whether potential outcomes should be modeled separately when TEstimator="DR". Default is FALSE.
data.public: An optional logical indicating whether the data in the output objects are public. Default is TRUE.
...: An optional argument passed to fit() of each estimator object for model training and tuning. See https://topepo.github.io/caret/model-training-and-tuning.html for details.
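
The discretization and subgroup arguments above can be pictured with base R alone. The sketch below is only an illustration under stated assumptions: it uses cut() and interaction() from base R, the variables age and sex are hypothetical, and the package's internal handling of strata_cut_source and stratification_joint may differ.

```r
# Illustrative sketch in base R; the package's internal handling may differ.
df <- data.frame(
  age = c(23, 35, 47, 58, 64, 71),
  sex = factor(c("f", "m", "f", "m", "f", "m"))
)

# Discretize a continuous variable: breaks define the intervals and
# labels name the levels, starting from the leftmost interval.
df$age_group <- cut(df$age,
                    breaks = c(0, 40, 60, Inf),
                    labels = c("young", "middle", "old"))

# Subgroups defined by each variable on its own
# (the analogue of stratification_joint = FALSE, as read from the text above).
individual_groups <- list(sex = levels(df$sex),
                          age_group = levels(df$age_group))

# Subgroups defined by the combined levels of both variables
# (the analogue of stratification_joint = TRUE).
joint_groups <- interaction(df$sex, df$age_group, drop = TRUE)
```

With two levels of sex and three age groups, the individual stratification yields five subgroups in total (two plus three), whereas the joint stratification yields up to six (two times three).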

Details

R6 objects are constructed by the wrapper functions TEstimator_wrapper() and SEstimator_wrapper() from the user's input of data and of the estimators for the treatment effect and the weights. TEstimator_wrapper() returns the initialized objects source.obj and target.obj. SEstimator_wrapper() weights the estimates of source.obj via the class method RCTrep(). The weights are computed using the data in the source object source.obj, the data in the target object target.obj, and the estimator of the weights SEstimator.
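
The two steps described above (estimate treatment effects in the source data, then reweight them toward the target population) can be sketched in base R. This is a simplified illustration, not the package's R6 implementation: it uses G-computation with a glm outcome model on simulated data, and the weights are hypothetical stratum-ratio weights in the spirit of exact matching, assuming a target population in which 80% of individuals have x = 1.

```r
# Illustrative sketch in base R; not the package's R6 implementation.
set.seed(2)
n <- 1000
x <- rbinom(n, 1, 0.5)                  # binary covariate
z <- rbinom(n, 1, plogis(-0.5 + x))     # treatment assignment depends on x
y <- 1 + 2 * z + x + z * x + rnorm(n)   # true effect is 2 + x
source_df <- data.frame(x, z, y)

# Step 1 (G-computation): fit an outcome model, then predict both
# potential outcomes for every individual and take the difference.
outcome_fit <- glm(y ~ z * x, data = source_df)
y1 <- predict(outcome_fit, newdata = transform(source_df, z = 1))
y0 <- predict(outcome_fit, newdata = transform(source_df, z = 0))
cate <- y1 - y0
ate_unweighted <- mean(cate)

# Step 2 (weighting): hypothetical weights equal to the ratio of the assumed
# target proportion to the observed source proportion in each stratum of x.
target_prop_x1 <- 0.8                   # assumed share of x = 1 in the target
source_prop_x1 <- mean(source_df$x == 1)
w <- ifelse(source_df$x == 1,
            target_prop_x1 / source_prop_x1,
            (1 - target_prop_x1) / (1 - source_prop_x1))
ate_weighted <- weighted.mean(cate, w)
```

Because the simulated effect is larger when x = 1 and the assumed target population contains more such individuals, the weighted estimate should exceed the unweighted one.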

Examples


library(RCTrep)

# Draw 500 rows at random from each of the example data sets shipped with RCTrep.
source_sample <- RCTrep::source.data[sample(nrow(RCTrep::source.data), 500), ]
target_sample <- RCTrep::target.data[sample(nrow(RCTrep::target.data), 500), ]

# G-computation with a BART outcome model; exact matching weights on x2 and x6.
output <- RCTREP(TEstimator = "G_computation", SEstimator = "Exact",
                 outcome_method = "BART",
                 source.data = source_sample,
                 target.data = target_sample,
                 vars_name = list(outcome_predictors =
                                    c("x1","x2","x3","x4","x5","x6"),
                                  treatment_name = c('z'),
                                  outcome_name = c('y')),
                 selection_predictors = c("x2","x6"),
                 stratification = c("x1","x3","x4","x5"),
                 stratification_joint = TRUE)

# Inspect the three returned objects.
output$target.obj
output$source.obj
output$source.rep.obj
