
DeepLearningCausal (version 0.0.107)

metalearner_deeplearning: metalearner_deeplearning

Description

metalearner_deeplearning estimates conditional average treatment effects (CATEs) with deep neural networks, using TensorFlow and Keras3. It supports four meta-learner models: the S-, T-, X- and R-learner.

Usage

metalearner_deeplearning(
  data = NULL,
  train.data = NULL,
  test.data = NULL,
  cov.formula,
  treat.var,
  meta.learner.type,
  nfolds = 5,
  algorithm = "adam",
  hidden.layer = c(2, 2),
  hidden_activation = "relu",
  output_activation = "linear",
  output_units = 1,
  loss = "mean_squared_error",
  metrics = "mean_squared_error",
  epoch = 10,
  verbose = 1,
  batch_size = 32,
  validation_split = NULL,
  patience = NULL,
  dropout_rate = NULL,
  conformal = FALSE,
  alpha = 0.1,
  calib_frac = 0.5,
  prob_bound = TRUE,
  seed = 1234
)

Value

metalearner_deeplearning object with CATEs

Arguments

data

data.frame object of data. If a single dataset is specified, the model uses cross-validation to train the meta-learners and estimate CATEs. Users can instead specify the train.data and test.data arguments (defined below) to train the meta-learners on their training data and estimate CATEs on their test data.

train.data

data.frame object of training data for Train/Test mode. Argument must be specified to separately train the meta-learners on the training data.

test.data

data.frame object of test data for Train/Test mode. Argument must be specified to estimate CATEs on the test data.
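The difference between the single-dataset mode and Train/Test mode can be sketched as follows. The simulated data frame, its column names, and the 70/30 split are illustrative assumptions and do not come from the package. The call itself is wrapped in if (FALSE), like the examples on this page, because it needs Python and TensorFlow to run.

```r
# Simulated data; column names are hypothetical, not from the package.
set.seed(1234)
n <- 200
sim_data <- data.frame(
  y     = rbinom(n, 1, 0.5),
  treat = rbinom(n, 1, 0.5),
  x1    = rnorm(n),
  x2    = rnorm(n)
)

# Hold out 30% of rows as test data (an arbitrary choice for illustration).
test_idx   <- sample(seq_len(n), size = round(0.3 * n))
train_data <- sim_data[-test_idx, ]
test_data  <- sim_data[test_idx, ]

if (FALSE) {
  library(DeepLearningCausal)
  # Train/Test mode: supply train.data and test.data instead of data.
  fit <- metalearner_deeplearning(
    train.data = train_data,
    test.data = test_data,
    cov.formula = y ~ x1 + x2,
    treat.var = "treat",
    meta.learner.type = "T.Learner"
  )
}
```

Passing data = sim_data instead of the two split data frames would correspond to the cross-validation mode described under data.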

cov.formula

formula describing the model, with the outcome on the left-hand side and the list of covariates on the right-hand side (y ~ x). It lets users specify the outcome variable and the covariates (confounders) for the meta-learner model of interest.

treat.var

string for name of Treatment variable. Users can specify the treatment variable in their data by employing the treat.var argument.

meta.learner.type

string, one of "S.Learner", "T.Learner", "X.Learner", or "R.Learner". Selects which of the four meta-learner models (S-, T-, X- or R-learner) is estimated via deep learning.
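To make the difference between two of these meta-learners concrete, here is a minimal sketch of the S-learner and T-learner ideas. It uses lm() as a stand-in for the neural network so that it runs in base R. The simulated data and variable names are assumptions for illustration, and this is not the package's implementation.

```r
set.seed(1234)
n <- 500
x <- rnorm(n)
w <- rbinom(n, 1, 0.5)            # treatment indicator
y <- 1 + x + 2 * w + rnorm(n)     # simulated outcome; true effect is 2
d <- data.frame(y = y, x = x, w = w)

# S-learner: a single model with treatment as a feature. The CATE is the
# difference between predictions with treatment switched on and off.
s_fit  <- lm(y ~ x * w, data = d)
cate_s <- predict(s_fit, transform(d, w = 1)) -
  predict(s_fit, transform(d, w = 0))

# T-learner: separate outcome models for treated and control units.
fit1   <- lm(y ~ x, data = d[d$w == 1, ])
fit0   <- lm(y ~ x, data = d[d$w == 0, ])
cate_t <- predict(fit1, d) - predict(fit0, d)

c(S = mean(cate_s), T = mean(cate_t))
```

The X- and R-learners build on the same ingredients but add further modelling stages, which are omitted here.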

nfolds

integer for number of folds for the meta-learners. When a single dataset is specified, cross-validation is used to train the meta-learners and estimate CATEs, and nfolds defines the number of folds into which the data are split.
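As a rough illustration of what splitting into nfolds folds means, the sketch below assigns rows to folds at random and loops over them. How the package forms its folds internally is not documented here, so treat this as a generic k-fold pattern under that assumption.

```r
set.seed(1234)
n      <- 103
nfolds <- 5

# Randomly assign each row to one of nfolds folds of near-equal size.
fold_id <- sample(rep(seq_len(nfolds), length.out = n))

for (k in seq_len(nfolds)) {
  train_rows <- which(fold_id != k)   # rows used to fit the learner
  test_rows  <- which(fold_id == k)   # rows that receive out-of-fold estimates
  # fit on train_rows, then predict on test_rows
}

table(fold_id)
```

Every row is held out exactly once, so each observation gets an estimate from a model that did not see it during training.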

algorithm

string for optimization algorithm used to train the deep neural networks for meta-learner estimation. Options include "adam", "adagrad", "rmsprop" and "sgd". For other available optimizers, see the keras package.

hidden.layer

permits users to specify the number of hidden layers in the model and the number of neurons in each hidden layer.
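One way to get a feel for how hidden.layer changes model size is to count the trainable parameters of a fully connected network. The helper count_params below is hypothetical and not part of the package. It reads each entry of the vector as the number of neurons in one hidden layer, and the input count of 8 matches the eight covariates in the examples on this page, ignoring any extra inputs the package may add.

```r
# Count weights and biases in a fully connected network.
# n_inputs: number of input features; hidden: units per hidden layer.
count_params <- function(n_inputs, hidden, output_units = 1) {
  sizes   <- c(n_inputs, hidden, output_units)
  fan_in  <- head(sizes, -1)
  fan_out <- tail(sizes, -1)
  sum(fan_in * fan_out + fan_out)   # weights plus one bias per unit
}

count_params(n_inputs = 8, hidden = c(2, 2))      # 27
count_params(n_inputs = 8, hidden = c(16, 8, 4))  # 321
```

The default c(2, 2) is a very small network, so larger values may be worth trying on real data, at the cost of longer training.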

hidden_activation

string or vector for name of activation function for hidden layers of model. Defaults to "relu". Specifying a single value uses that one activation function for each hidden layer. While "relu" is a popular choice for hidden layers, users can also use "softmax", which converts a vector of values into a probability distribution, and "tanh", which maps input to a value between -1 and 1.
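The three activation functions mentioned above can be written out in base R to show what each does to a vector of inputs. These are the standard textbook definitions, not code taken from keras.

```r
relu    <- function(z) pmax(z, 0)
softmax <- function(z) {
  e <- exp(z - max(z))   # subtract the max for numerical stability
  e / sum(e)
}

z <- c(-2, 0, 3)
relu(z)      # negative inputs become 0
tanh(z)      # values squeezed into (-1, 1)
softmax(z)   # non-negative values that sum to 1
```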

output_activation

string for name of activation function for output layer of model. "linear" is recommended for continuous outcome variables, and "sigmoid" for binary outcome variables. For available activation functions, see the keras package. Keras provides various activation functions that can be used in neural network layers to introduce non-linearity.

output_units

integer for units in output layer. Defaults to 1 for continuous and binary outcome variables. In case of multinomial outcome variable, set to the number of categories.

loss

string for loss function. "mean_squared_error" is recommended for linear models, "binary_crossentropy" for binary models.

metrics

string for metrics in response model. "mean_squared_error" recommended for linear models, "binary_accuracy" for binary models.
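For readers unfamiliar with these names, the quantities behind "mean_squared_error", "binary_crossentropy" and "binary_accuracy" can be sketched in base R. The function names, the 0.5 threshold and the eps guard are illustrative assumptions. Keras computes these internally.

```r
mse <- function(y, pred) mean((y - pred)^2)

# Binary cross-entropy; eps guards against log(0).
bce <- function(y, p, eps = 1e-7) {
  p <- pmin(pmax(p, eps), 1 - eps)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

# Share of observations whose thresholded prediction matches the label.
binary_accuracy <- function(y, p, threshold = 0.5) mean((p > threshold) == y)

y <- c(1, 0, 1, 1)
p <- plogis(c(2, -1, 0.5, -0.3))   # sigmoid turns raw scores into probabilities

mse(y, p)
bce(y, p)
binary_accuracy(y, p)   # 0.75: the last observation is misclassified
```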

epoch

integer for number of epochs. An epoch is one complete pass through the entire training dataset, so the model processes each training example once per epoch.

verbose

integer specifying the verbosity level during training. 1 for full information and learning curve plots. 0 to suppress messages and plots.

batch_size

integer for batch size to split training data. Batch size is the number of training samples processed before the model's parameters are updated. It is a hyperparameter that affects both training speed and model performance.
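The interaction between epoch and batch_size comes down to simple arithmetic. The sketch below assumes a hypothetical training set of 1000 rows and assumes that the final, smaller batch is still used for an update.

```r
n_train    <- 1000   # hypothetical number of training rows
batch_size <- 32
epoch      <- 10

# Parameter updates in one pass over the data, and over the whole run.
updates_per_epoch <- ceiling(n_train / batch_size)   # 32
total_updates     <- updates_per_epoch * epoch       # 320
```

Smaller batches give more updates per epoch but noisier gradient estimates. If validation_split is set, fewer rows remain for training and the counts shrink accordingly.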

validation_split

double for proportion of training data to hold out for validation. The validation split partitions the data into training and validation sets, which are used to build and tune the model.

patience

integer for number of epochs with no improvement to wait before stopping training. Training of the neural network stops early if the model's performance on the validation data stops improving for this many epochs.
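The stopping rule can be sketched with a small helper that scans a sequence of validation losses. The function stop_epoch and the loss values are hypothetical. The sketch assumes that validation loss is the monitored quantity and that any strict decrease counts as an improvement.

```r
# Return the epoch at which training would stop, given per-epoch validation loss.
stop_epoch <- function(val_loss, patience) {
  best <- Inf
  wait <- 0
  for (i in seq_along(val_loss)) {
    if (val_loss[i] < best) {
      best <- val_loss[i]   # new best: reset the counter
      wait <- 0
    } else {
      wait <- wait + 1      # no improvement this epoch
      if (wait >= patience) return(i)
    }
  }
  length(val_loss)          # rule never triggered: all epochs ran
}

val_loss <- c(0.90, 0.70, 0.65, 0.66, 0.67, 0.64, 0.68)
stop_epoch(val_loss, patience = 2)   # 5
stop_epoch(val_loss, patience = 3)   # 7
```

With patience = 2, training stops before the later improvement at epoch 6 is seen, which shows the trade-off in choosing a small value.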

dropout_rate

double or vector for proportion of hidden layer to drop out. The dropout rate is a hyperparameter for preventing a model from overfitting the training data.
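What dropping out a proportion of a hidden layer means can be illustrated with the common inverted-dropout formulation. The dropout helper below is a hypothetical stand-in written in base R, and it assumes the usual rescaling of kept units during training.

```r
set.seed(1234)

# Zero out each unit with probability `rate`, rescaling the survivors.
dropout <- function(h, rate) {
  keep <- rbinom(length(h), 1, 1 - rate)   # 1 = keep unit, 0 = drop it
  h * keep / (1 - rate)                    # rescale so the expected value is unchanged
}

h   <- rep(1, 10000)          # stand-in for hidden-layer activations
out <- dropout(h, rate = 0.2)

mean(out == 0)   # share of dropped units, close to 0.2
mean(out)        # close to 1, the original mean
```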

conformal

logical for whether to compute conformal prediction intervals. Conformal prediction intervals provide a measure of uncertainty for individual treatment effects (ITEs).

alpha

proportion for conformal prediction intervals. alpha is the significance level, so the intervals target a coverage probability of 1 - alpha for ITEs.

calib_frac

fraction of training data to use for calibration in conformal inference

prob_bound

logical for whether to bound conformal intervals within [-1,1] for classification models
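The roles of alpha, calib_frac and prob_bound can be sketched with generic split conformal prediction. This example uses lm() on simulated data rather than a neural network, and it builds intervals for an outcome prediction rather than for ITEs. How the package combines these pieces internally is not documented on this page, so everything below is an assumption for illustration.

```r
set.seed(1234)
n <- 400
x <- runif(n)
y <- x + rnorm(n, sd = 0.3)
d <- data.frame(x = x, y = y)

alpha      <- 0.1
calib_frac <- 0.5

# Split the training data into a fitting part and a calibration part.
calib_idx <- sample(seq_len(n), size = floor(calib_frac * n))
fit       <- lm(y ~ x, data = d[-calib_idx, ])

# Absolute residuals on the calibration set serve as conformity scores.
scores <- abs(d$y[calib_idx] - predict(fit, d[calib_idx, ]))
n_cal  <- length(scores)

# Finite-sample corrected (1 - alpha) quantile of the scores.
k    <- ceiling((n_cal + 1) * (1 - alpha))
qhat <- if (k > n_cal) Inf else sort(scores)[k]

# Interval for new points: prediction plus or minus qhat.
new_x <- data.frame(x = c(0.2, 0.8))
pred  <- predict(fit, new_x)
lower <- pred - qhat
upper <- pred + qhat

# Bounding in the spirit of prob_bound: clamp both ends to [-1, 1],
# the range of a difference between two probabilities.
clamp   <- function(v) pmin(pmax(v, -1), 1)
lower_b <- clamp(lower)
upper_b <- clamp(upper)
```

A smaller alpha widens the intervals, and a larger calib_frac leaves fewer rows for fitting the model but more for estimating the quantile.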

seed

random seed

Examples

if (FALSE) {
  # check for Python and required modules
  python_ready()
  data("exp_data")

  s_deeplearning <- metalearner_deeplearning(
    data = exp_data,
    cov.formula = support_war ~ age + female + income + education +
      employed + married + hindu + job_loss,
    treat.var = "strong_leader", meta.learner.type = "S.Learner",
    nfolds = 5, algorithm = "adam",
    hidden.layer = c(2, 2), hidden_activation = "relu",
    output_activation = "sigmoid", output_units = 1,
    loss = "binary_crossentropy", metrics = "accuracy",
    epoch = 10, verbose = 1, batch_size = 32,
    validation_split = NULL, patience = NULL,
    dropout_rate = NULL, conformal = FALSE, seed = 1234
  )
}
if (FALSE) {
  # check for Python and required modules
  python_ready()
  data("exp_data")

  t_deeplearning <- metalearner_deeplearning(
    data = exp_data,
    cov.formula = support_war ~ age + female + income + education +
      employed + married + hindu + job_loss,
    treat.var = "strong_leader", meta.learner.type = "T.Learner",
    nfolds = 5, algorithm = "adam",
    hidden.layer = c(2, 2), hidden_activation = "relu",
    output_activation = "sigmoid", output_units = 1,
    loss = "binary_crossentropy", metrics = "accuracy",
    epoch = 10, verbose = 1, batch_size = 32,
    validation_split = NULL, patience = NULL,
    dropout_rate = NULL, conformal = TRUE,
    alpha = 0.1, calib_frac = 0.5, prob_bound = TRUE, seed = 1234
  )
}
