Data backend for double machine learning (DML).
xtdml_data sets up the data environment for panel data analysis with transformed variables.
The xtdml_data_from_data_frame() function can be used to create a new
instance of xtdml_data from a data.frame.
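As a minimal sketch of the kind of input this class works with, the snippet below builds a toy long-format panel in base R and runs two sanity checks before any data object is created. The column names (`id`, `time`, `x1`, `x2`, `d`, `y`) and the checks are illustrative assumptions, not requirements stated by the package; the `roles` list only mirrors the argument names documented on this page.

```r
# Toy long-format panel: 3 units observed over 4 periods (all names hypothetical).
set.seed(1)
n_units <- 3
n_periods <- 4
df <- data.frame(
  id   = rep(seq_len(n_units), each = n_periods),   # panel identifier
  time = rep(seq_len(n_periods), times = n_units)   # time identifier
)
df$x1 <- rnorm(nrow(df))                        # covariate
df$x2 <- rnorm(nrow(df))                        # covariate
df$d  <- 0.5 * df$x1 + rnorm(nrow(df))          # treatment
df$y  <- 0.5 * df$d + df$x2 + rnorm(nrow(df))   # outcome

# Column roles, named after the constructor arguments documented on this page.
roles <- list(
  y_col = "y", d_cols = "d", x_cols = c("x1", "x2"),
  panel_id = "id", time_id = "time"
)

# Checks one might run first: all role columns exist, and each
# (panel_id, time_id) pair identifies exactly one row.
all_present <- all(unlist(roles) %in% names(df))
unique_keys <- !any(duplicated(df[, c(roles$panel_id, roles$time_id)]))
```

A data frame prepared this way could then be handed to the constructor, with each element of `roles` passed to the argument of the same name.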
all_variables(character())
All variables available in the data frame.
d_cols(character())
The treatment variable.
dbar_col(NULL, character())
The individual mean of the treatment variable.
data(data.table)
Data object.
data_model(data.table)
Internal data object that implements the causal panel model as specified by
the user via y_col, d_cols, x_cols, dbar_col.
n_obs(integer(1))
The number of observations.
n_treat(integer(1))
The number of treatment variables.
treat_col(character(1))
"Active" treatment variable in the multiple-treatment case.
x_cols(character())
The covariates.
y_col(character(1))
The outcome variable.
panel_id(character())
The panel identifier.
time_id(character())
The time identifier.
cluster_cols(character())
The cluster variable(s).
n_cluster_vars(integer(1))
The number of cluster variables.
approach(character(1))
A character() ("fd-exact", "wg-approx" or "cre") specifying the panel data
technique to apply to estimate the causal model. Default is "fd-exact".
transformX(character(1))
A character() ("no", "minmax" or "poly") specifying the type
of transformation to apply to the X data. "no" does not transform the covariates X
and is recommended for tree-based learners. "minmax" applies the Min-Max normalization
\(x' = (x-x_{min})/(x_{max}-x_{min})\) to the covariates and is recommended with neural networks.
"poly" adds polynomials up to order three and interactions between all possible
combinations of two and three variables; this is recommended for Lasso. Default is "no".
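The two non-trivial transformX options can be illustrated in base R. This is a hedged sketch of the transformations as described above, not the package's internal code: the helper names `minmax()` and `poly3()` are hypothetical, and the exact set and naming of columns that "poly" generates inside the package may differ.

```r
# Min-max normalization: rescales every column to [0, 1].
# A constant column has zero range and would produce NaN.
minmax <- function(X) {
  X <- as.matrix(X)
  mins <- apply(X, 2, min)
  ranges <- apply(X, 2, max) - mins
  sweep(sweep(X, 2, mins, "-"), 2, ranges, "/")
}

# Polynomial expansion: powers 1 to 3 of each variable, plus products of
# every combination of two and of three distinct variables.
poly3 <- function(X) {
  X <- as.matrix(X)
  if (is.null(colnames(X))) colnames(X) <- paste0("x", seq_len(ncol(X)))
  nm <- colnames(X)
  powers <- do.call(cbind, lapply(1:3, function(p) {
    out <- X^p
    colnames(out) <- if (p == 1) nm else paste0(nm, "^", p)
    out
  }))
  interact <- function(k) {
    if (ncol(X) < k) return(NULL)
    idx <- combn(ncol(X), k)  # each column holds one combination of k variables
    out <- vapply(
      seq_len(ncol(idx)),
      function(j) Reduce(`*`, lapply(idx[, j], function(i) X[, i])),
      numeric(nrow(X))
    )
    colnames(out) <- apply(idx, 2, function(j) paste(nm[j], collapse = ":"))
    out
  }
  cbind(powers, interact(2), interact(3))
}

X <- cbind(a = c(1, 2, 3, 4), b = c(10, 20, 40, 30), c = c(-1, 0, 1, 3))
X_mm <- minmax(X)    # every column now spans [0, 1]
X_poly <- poly3(X)   # 9 power terms + 3 pairwise + 1 three-way interaction
```

With three covariates the expansion already yields 13 columns, and the count grows quickly with more covariates, which is consistent with the note that this option is aimed at Lasso-type learners.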
new()
Creates a new instance of this R6 class.
xtdml_data$new(
data = NULL,
x_cols = NULL,
y_col = NULL,
d_cols = NULL,
dbar_col = NULL,
panel_id = NULL,
time_id = NULL,
cluster_cols = NULL,
approach = NULL,
transformX = NULL
)
data(data.table, data.frame())
Data object.
x_cols(character())
The covariates.
y_col(character(1))
The outcome variable.
d_cols(character(1))
The treatment variable.
dbar_col(NULL, character())
The individual mean of the treatment variable (used for the CRE approach). Default is NULL.
panel_id(character())
The panel identifier.
time_id(character())
The time identifier.
cluster_cols(character())
The cluster variable(s).
approach(character(1))
A character() ("fd-exact", "wg-approx" or "cre")
specifying the panel data technique to apply
to estimate the causal model. Default is "fd-exact".
transformX(character(1))
A character() ("no", "minmax" or "poly") specifying the type
of transformation to apply to the X data. "no" does not transform the covariates X
and is recommended for tree-based learners. "minmax" applies the Min-Max normalization
\(x' = (x-x_{min})/(x_{max}-x_{min})\) to the covariates and is recommended with neural networks.
"poly" adds polynomials up to order three and interactions between all possible
combinations of two and three variables; this is recommended for Lasso.
Default is "no".
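The dbar_col argument refers to the individual mean of the treatment variable. The sketch below shows one way such a column could be computed in base R, in case it has to be supplied by hand; whether the package derives it automatically when dbar_col is NULL is not stated here, and the column names `id`, `time`, `d` and `dbar` are hypothetical.

```r
# Toy panel: 2 units, 3 periods each (hypothetical column names).
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  time = c(1, 2, 3, 1, 2, 3),
  d    = c(1, 2, 3, 10, 20, 60)
)

# Individual (unit-level) mean of the treatment, repeated on every row
# belonging to the same panel unit.
df$dbar <- ave(df$d, df$id, FUN = mean)
```

The resulting column name could then be passed as `dbar_col = "dbar"` together with `approach = "cre"`.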
plot()
Plotting method, which is not implemented for xtdml objects.
Attempting to call it returns an informative message.
Use the print() method to view xtdml_data objects.
xtdml_data$plot()
set_data_model()
Setter function for data_model.
The function implements the causal model
as specified by the user via y_col, d_cols, x_cols, panel_id, time_id and
cluster_cols and assigns the role for the treatment variables in the
multiple-treatment case.
xtdml_data$set_data_model(treatment_var)
treatment_var(character())
Active treatment variable that will be set to treat_col.
clone()
The objects of this class are cloneable with this method.
xtdml_data$clone(deep = FALSE)
deep
Whether to make a deep clone.