metalearner_deeplearning: metalearner_deeplearning

Description

metalearner_deeplearning estimates conditional average treatment effects (CATEs) with deep neural networks, using TensorFlow and Keras 3. It supports four meta-learner models: the S-, T-, X-, and R-learner.
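
To make the idea of a meta-learner concrete, the following is a minimal base R sketch of how an S-learner and a T-learner turn outcome predictions into CATEs. It is illustrative only and is not the package's implementation: linear models (lm) stand in for the neural networks, and the simulated data and all variable names (x, w, y, cate_s, cate_t) are assumptions made for this example.

```r
# Illustrative only: linear models stand in for the neural networks.
set.seed(1234)
n <- 500
x <- rnorm(n)                          # a single covariate
w <- rbinom(n, 1, 0.5)                 # randomized binary treatment
y <- 1 + 0.5 * x + 2 * w + rnorm(n)    # simulated outcome; true effect is 2
df <- data.frame(y = y, x = x, w = w)

# S-learner: one model with the treatment as a feature. The CATE is the
# difference between predictions with treatment set to 1 and set to 0.
s_fit <- lm(y ~ x + w, data = df)
cate_s <- predict(s_fit, transform(df, w = 1)) -
  predict(s_fit, transform(df, w = 0))

# T-learner: separate outcome models for treated and control units. The CATE
# is the difference between the two models' predictions for every unit.
fit1 <- lm(y ~ x, data = df[df$w == 1, ])
fit0 <- lm(y ~ x, data = df[df$w == 0, ])
cate_t <- predict(fit1, df) - predict(fit0, df)

mean(cate_s)
mean(cate_t)
```

Both averages should land near the simulated effect of 2. The X- and R-learner build further estimation stages on top of outcome models like these.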

Usage

metalearner_deeplearning(
  data = NULL,
  train.data = NULL,
  test.data = NULL,
  cov.formula,
  treat.var,
  meta.learner.type,
  nfolds = 5,
  algorithm = "adam",
  hidden.layer = c(2, 2),
  hidden_activation = "relu",
  output_activation = "linear",
  output_units = 1,
  loss = "mean_squared_error",
  metrics = "mean_squared_error",
  epoch = 10,
  verbose = 1,
  batch_size = 32,
  validation_split = NULL,
  patience = NULL,
  dropout_rate = NULL,
  conformal = FALSE,
  alpha = 0.1,
  calib_frac = 0.5,
  prob_bound = TRUE,
  seed = 1234
)

Value

A metalearner_deeplearning object containing the estimated CATEs.

Arguments

data: data.frame object of data. If a single dataset is specified, the model uses cross-validation to train the meta-learners and estimate CATEs. Alternatively, users can supply train.data and test.data (defined below) to train the meta-learners on their training data and estimate CATEs on their test data.
train.data: data.frame object of training data for Train/Test mode. Argument must be specified to separately train the meta-learners on the training data.
test.data: data.frame object of test data for Train/Test mode. Argument must be specified to estimate CATEs on the test data.
cov.formula: formula describing the model, of the form y ~ x (list of covariates). The left-hand side names the outcome variable and the right-hand side lists the covariates (confounders) used in the meta-learner model.
treat.var: string giving the name of the treatment variable in the data.
meta.learner.type: string, one of "S.Learner", "T.Learner", "X.Learner", or "R.Learner". Specifies which of the four meta-learner models (S-, T-, X-, or R-learner) is estimated via deep learning.
nfolds: integer for the number of folds for the meta-learners. Used when a single dataset is specified: the data are split into nfolds folds for cross-validation to train the meta-learners and estimate CATEs.
algorithm: string for the optimization algorithm used to train the deep neural networks. Options include "adam", "adagrad", "rmsprop", and "sgd". For the optimizers available, see the keras package.
hidden.layer: vector specifying the number of hidden layers in the model and the number of neurons in each hidden layer. For example, the default c(2, 2) specifies two hidden layers with two neurons each.
hidden_activation: string or vector naming the activation function for the hidden layers of the model. Defaults to "relu". A single value applies the same activation function to every hidden layer. While "relu" is a popular choice for hidden layers, users can also use "softmax", which converts a vector of values into a probability distribution, and "tanh", which maps input to a value between -1 and 1.
output_activation: string for name of activation function for output layer of model. "linear" is recommended for continuous outcome variables, and "sigmoid" for binary outcome variables. For the activation functions available, see the keras package; Keras provides various activation functions that can be used in neural network layers to introduce non-linearity.
output_units: integer for units in output layer. Defaults to 1 for continuous and binary outcome variables. For a multinomial outcome variable, set to the number of categories.
loss: string for loss function. "mean_squared_error" is recommended for linear models, "binary_crossentropy" for binary models.
metrics: string for metrics in response model. "mean_squared_error" recommended for linear models, "binary_accuracy" for binary models.
epoch: integer for number of epochs. An epoch is one complete pass through the entire training dataset; the model processes each training example once per epoch.
verbose: integer specifying the verbosity level during training. 1 for full information and learning curve plots. 0 to suppress messages and plots.
batch_size: integer for batch size to split training data. Batch size is the number of training samples processed before the model's parameters are updated. It is a hyperparameter that affects training speed, computational efficiency, and model performance.
validation_split: double for proportion of training data to split for validation. The training data are partitioned into training and validation sets to build and tune the model.
patience: integer for number of epochs with no improvement to wait before stopping training. Training stops early if the model's performance on the validation data stops improving.
dropout_rate: double or vector for proportion of hidden layer to drop out. Dropout is a regularization hyperparameter that helps prevent the model from overfitting the training data.
conformal: logical for whether to compute conformal prediction intervals. Conformal prediction intervals provide a measure of uncertainty for ITEs.
alpha: proportion for conformal prediction intervals. alpha is the significance level: intervals target a coverage probability of 1 - alpha for ITEs (90% at the default of 0.1).
calib_frac: fraction of training data to use for calibration in conformal inference
prob_bound: logical for whether to bound conformal intervals within [-1,1] for classification models
seed: random seed
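
The roles of alpha and calib_frac can be illustrated with a generic split conformal sketch in base R. This is an assumption-laden illustration for a plain regression prediction, not the package's procedure for ITEs: a linear model stands in for the neural network, and the simulated data and names (scores, q_hat, intervals) are hypothetical.

```r
# Illustrative only: generic split conformal intervals for a regression model.
set.seed(1234)
n <- 400
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
df <- data.frame(x = x, y = y)

alpha <- 0.1       # target miscoverage; intervals aim for 90% coverage
calib_frac <- 0.5  # share of the data held out for calibration

# Hold out a calibration set and fit the model on the remaining rows.
calib_idx <- sample(n, size = floor(calib_frac * n))
fit <- lm(y ~ x, data = df[-calib_idx, ])

# Conformity scores: absolute residuals on the calibration set.
scores <- abs(df$y[calib_idx] - predict(fit, df[calib_idx, ]))
n_cal <- length(scores)

# Finite-sample corrected quantile of the scores.
k <- ceiling((n_cal + 1) * (1 - alpha))
stopifnot(k <= n_cal)  # otherwise the calibration set is too small for alpha
q_hat <- sort(scores)[k]

# Interval for new points: point prediction plus or minus q_hat.
new_x <- data.frame(x = c(-1, 0, 1))
pred <- predict(fit, new_x)
intervals <- data.frame(lower = pred - q_hat, estimate = pred,
                        upper = pred + q_hat)
intervals
```

A smaller alpha widens the intervals, and a larger calib_frac leaves fewer rows for fitting the model but more for calibrating the interval width.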

Examples


if (FALSE) {
  # check for python and required modules
  python_ready()
  data("exp_data")

  s_deeplearning <- metalearner_deeplearning(
    data = exp_data,
    cov.formula = support_war ~ age + female + income + education +
      employed + married + hindu + job_loss,
    treat.var = "strong_leader", meta.learner.type = "S.Learner",
    nfolds = 5, algorithm = "adam",
    hidden.layer = c(2, 2), hidden_activation = "relu",
    output_activation = "sigmoid", output_units = 1,
    loss = "binary_crossentropy", metrics = "accuracy",
    epoch = 10, verbose = 1, batch_size = 32,
    validation_split = NULL, patience = NULL,
    dropout_rate = NULL, conformal = FALSE, seed = 1234
  )
}
if (FALSE) {
  # check for python and required modules
  python_ready()
  data("exp_data")

  t_deeplearning <- metalearner_deeplearning(
    data = exp_data,
    cov.formula = support_war ~ age + female + income + education +
      employed + married + hindu + job_loss,
    treat.var = "strong_leader", meta.learner.type = "T.Learner",
    nfolds = 5, algorithm = "adam",
    hidden.layer = c(2, 2), hidden_activation = "relu",
    output_activation = "sigmoid", output_units = 1,
    loss = "binary_crossentropy", metrics = "accuracy",
    epoch = 10, verbose = 1, batch_size = 32,
    validation_split = NULL, patience = NULL,
    dropout_rate = NULL, conformal = TRUE,
    alpha = 0.1, calib_frac = 0.5, prob_bound = TRUE, seed = 1234
  )
}
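
The examples above use a single dataset with cross-validation. The following sketch shows how Train/Test mode might be called instead, based only on the train.data and test.data arguments described above. It assumes that leaving data at its default of NULL and supplying both data frames is what selects Train/Test mode; split_train_test is a hypothetical helper written for this example and is not part of the package.

```r
# Hypothetical helper (not part of the package): random train/test split.
split_train_test <- function(df, train_frac = 0.8, seed = 1234) {
  set.seed(seed)
  train_idx <- sample(nrow(df), size = floor(train_frac * nrow(df)))
  list(train = df[train_idx, , drop = FALSE],
       test = df[-train_idx, , drop = FALSE])
}

if (FALSE) {
  # check for python and required modules
  python_ready()
  data("exp_data")

  # Assumption: an 80/20 split; choose a fraction that suits your data.
  parts <- split_train_test(exp_data, train_frac = 0.8)

  s_traintest <- metalearner_deeplearning(
    train.data = parts$train,
    test.data = parts$test,
    cov.formula = support_war ~ age + female + income + education +
      employed + married + hindu + job_loss,
    treat.var = "strong_leader", meta.learner.type = "S.Learner",
    algorithm = "adam",
    hidden.layer = c(2, 2), hidden_activation = "relu",
    output_activation = "sigmoid", output_units = 1,
    loss = "binary_crossentropy", metrics = "accuracy",
    epoch = 10, verbose = 1, batch_size = 32,
    conformal = FALSE, seed = 1234
  )
}
```

Here the meta-learner would be trained on parts$train and CATEs estimated for the rows of parts$test, so nfolds is left at its default.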
