xtdml (version 0.1.11)

xtdml_data_from_data_frame: Initialization of Abstract Class xtdml_data

Description

Wrapper for data-backed initialization of an xtdml_data object from a data frame.

Usage

xtdml_data_from_data_frame(
  df,
  x_cols = NULL,
  y_col = NULL,
  d_cols = NULL,
  panel_id = NULL,
  time_id = NULL,
  cluster_cols = NULL,
  approach = NULL,
  transformX = NULL
)

Value

Creates a new instance of class xtdml_data.

Arguments

df

(data.frame())
Data object.

x_cols

(character())
The covariates.

y_col

(character(1))
The outcome variable.

d_cols

(character())
The treatment variable(s).

panel_id

(NULL, character())
The panel identifier. Default is NULL.

time_id

(NULL, character())
The time identifier. Default is NULL.

cluster_cols

(NULL, character())
The cluster variables. If NULL (the default), panel_id is used.

approach

(character(1))
A character() ("fd-exact", "wg-approx", "cre" or "pooled") specifying the panel data technique used to estimate the causal model. Default is NULL.

transformX

(character(1))
A character() ("no", "minmax" or "poly") specifying the type of transformation to apply to the X data. "no" does not transform the covariates X and is recommended for tree-based learners. "minmax" applies the Min-Max normalization \(x' = (x-x_{min})/(x_{max}-x_{min})\) to the covariates and is recommended for neural networks. "poly" adds polynomials up to order three and interactions between all possible combinations of two and three variables; this is recommended for Lasso. Default is "no".
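
To make the "minmax" formula concrete, the following base R sketch applies \(x' = (x-x_{min})/(x_{max}-x_{min})\) column by column to a small covariate matrix. The helper name minmax_scale and the toy data are hypothetical and only illustrate the formula; this is not the package's internal implementation, and the handling of constant columns is an assumption made here.

```r
# Hypothetical helper (not part of xtdml): column-wise Min-Max normalization.
minmax_scale <- function(X) {
  X <- as.matrix(X)
  col_min <- apply(X, 2, min)
  col_max <- apply(X, 2, max)
  col_rng <- col_max - col_min
  # Assumption: constant columns are mapped to 0 rather than dividing by zero.
  col_rng[col_rng == 0] <- 1
  # Subtract each column's minimum, then divide by its range.
  sweep(sweep(X, 2, col_min, "-"), 2, col_rng, "/")
}

# Toy covariates on very different scales
set.seed(1)
X <- cbind(X1 = rnorm(5), X2 = runif(5, min = 10, max = 20))
X_scaled <- minmax_scale(X)

# After scaling, every column lies in [0, 1]
apply(X_scaled, 2, range)
```

Rescaling each covariate to the unit interval puts all columns on a comparable scale, which is the usual motivation for normalizing inputs before fitting a neural network.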

Examples

library(xtdml)

# Generate a simulated panel dataset with `make_plpr_data()` from `xtdml`
data = make_plpr_data(n_obs = 500, t_per = 10, dim_x = 30, theta = 0.5, rho = 0.8)

# Names of the 30 covariate columns: "X1", ..., "X30"
x_cols = paste0("X", 1:30)

# Set up the DML data object
obj_xtdml_data = xtdml_data_from_data_frame(data,
  x_cols = x_cols, y_col = "y", d_cols = "d",
  panel_id = "id",
  time_id = "time",
  cluster_cols = "id",
  approach = "fd-exact",
  transformX = "no"
)

# Print a summary of the data object
obj_xtdml_data$print()