xtdml_plr: DML Method for Partially Linear Panel Models

Description

Estimates partially linear panel regression models with fixed effects in a double machine learning framework.

Format

R6::R6Class object inheriting from xtdml.

Super class

xtdml::xtdml -> xtdml_plr

Methods

Public methods

Inherited methods

Method `new()`

Creates a new instance of this R6 class.

Usage

xtdml_plr$new(
  data,
  ml_l,
  ml_m,
  ml_g = NULL,
  n_folds = 5,
  n_rep = 1,
  score = "orth-PO",
  dml_procedure = "dml2",
  draw_sample_splitting = TRUE,
  apply_cross_fitting = TRUE
)

Arguments

data

(xtdml_data)
The xtdml_data object providing the data and specifying the variables of the causal model.

ml_l

(LearnerRegr, Learner, character(1))
A learner of the class LearnerRegr, which is available from mlr3 or its extension packages mlr3learners or mlr3extralearners. Alternatively, a Learner object with public field task_type = "regr" can be passed, for example of class GraphLearner. The learner can be passed with prespecified parameters, for example lrn("regr.cv_glmnet", s = "lambda.min").
ml_l refers to the nuisance function $l_0(X) = E[Y|X]$.

ml_m

(LearnerRegr, LearnerClassif, Learner, character(1))
A learner of the class LearnerRegr, which is available from mlr3 or its extension packages mlr3learners or mlr3extralearners. For binary treatment variables, an object of the class LearnerClassif can be passed, for example lrn("classif.cv_glmnet", s = "lambda.min"). Alternatively, a Learner object with public field task_type = "regr" or task_type = "classif", respectively, can be passed, for example of class GraphLearner.
ml_m refers to the nuisance function $m_0(X) = E[D|X]$.

ml_g

(LearnerRegr, Learner, character(1))
A learner of the class LearnerRegr, which is available from mlr3 or its extension packages mlr3learners or mlr3extralearners. Alternatively, a Learner object with public field task_type = "regr" can be passed, for example of class GraphLearner. The learner can be passed with prespecified parameters, for example lrn("regr.cv_glmnet", s = "lambda.min").
ml_g refers to the nuisance function $g_0(X) = E[Y - D\theta_0|X]$. Note: The learner ml_g is only required for the score "orth-IV". Optionally, it can be specified and estimated for callable scores.

n_folds

(integer(1))
Number of folds. Default is 5.

n_rep

(integer(1))
Number of repetitions for the sample splitting. Default is 1.

score

(character(1))
A character(1), either "orth-PO" or "orth-IV". "orth-PO" is the Neyman-orthogonal score with the partialling-out formula. "orth-IV" is the Neyman-orthogonal score with the IV-type formula. Default is "orth-PO".

dml_procedure

(character(1))
A character(1) ("dml1" or "dml2") specifying the double machine learning algorithm. Default is "dml2".

draw_sample_splitting

(logical(1))
Indicates whether the sample splitting should be drawn during initialization of the object. Default is TRUE.

apply_cross_fitting

(logical(1))
Indicates whether cross-fitting should be applied. Default is TRUE.
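
The two dml_procedure options listed above can be written out as follows. This is a sketch based on the general double machine learning literature, assuming xtdml follows the same convention; the notation ($K$ folds with index sets $I_k$, observations $W_i$, score $\psi$, fold-specific nuisance estimates $\hat{\eta}_k$) is introduced here for illustration and does not appear in the package documentation.

```latex
% "dml1": solve the score equation separately in each fold, then average
\[
\frac{1}{|I_k|} \sum_{i \in I_k} \psi\bigl(W_i; \check{\theta}_k, \hat{\eta}_k\bigr) = 0,
\quad k = 1, \dots, K,
\qquad
\tilde{\theta}_0 = \frac{1}{K} \sum_{k=1}^{K} \check{\theta}_k
\]

% "dml2": solve one pooled score equation over all folds
\[
\frac{1}{N} \sum_{k=1}^{K} \sum_{i \in I_k} \psi\bigl(W_i; \tilde{\theta}_0, \hat{\eta}_k\bigr) = 0
\]
```

In both cases the nuisance estimates $\hat{\eta}_k$ are fitted on the observations outside fold $k$ when apply_cross_fitting is TRUE.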

Method `set_ml_nuisance_params()`

Sets hyperparameters for the nuisance models.

Usage

xtdml_plr$set_ml_nuisance_params(
  learner = NULL,
  treat_var = NULL,
  params,
  set_fold_specific = FALSE
)

Arguments

learner

(character(1))
The nuisance model/learner (see method params_names).

treat_var

(character(1))
The treatment variable (hyperparameters can be set separately for each treatment variable).

params

(named list())
A named list() with estimator parameters. Parameters are used for all folds by default. Alternatively, parameters can be passed in a fold-specific way if the option set_fold_specific is TRUE. In this case, the outer list needs to be of length n_rep and the inner list of length n_folds.

set_fold_specific

(logical(1))
Indicates whether the parameters passed in params should be applied in a fold-specific way. Default is FALSE. If TRUE, the outer list needs to be of length n_rep and the inner list of length n_folds.

Returns

self
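
The list shapes described for params can be sketched in base R. This is a minimal illustration, not package code: the values of n_rep and n_folds, the rpart parameters cp and maxdepth, and the treatment name "d" are assumptions borrowed from the example on this page, and the calls on obj_xtdml are left commented out because they need a fitted-out xtdml_plr object.

```r
# Assumed settings, matching an object created with n_rep = 1 and n_folds = 3
n_rep = 1
n_folds = 3

# Same parameters for every fold: a flat named list
global_params = list(cp = 0.01, maxdepth = 4)

# Fold-specific parameters: outer list of length n_rep,
# each element an inner list of length n_folds
fold_params = lapply(seq_len(n_rep), function(r) {
  lapply(seq_len(n_folds), function(k) list(cp = 0.01, maxdepth = 2 + k))
})

# Check the shape before handing the list to the estimator
stopifnot(length(fold_params) == n_rep)
stopifnot(all(lengths(fold_params) == n_folds))

# Hypothetical usage on an existing xtdml_plr object (assumed name obj_xtdml):
# obj_xtdml$set_ml_nuisance_params("ml_l", "d", params = global_params)
# obj_xtdml$set_ml_nuisance_params("ml_l", "d", params = fold_params,
#                                  set_fold_specific = TRUE)
```

Checking the list lengths up front makes a mismatch with n_rep or n_folds visible before any model is fitted.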

Method `tune()`

Conducts hyperparameter tuning.

Hyperparameter tuning is performed using the tuning methods provided in the mlr3tuning package. For more information on tuning in mlr3, see the section on parameter tuning in the mlr3 book.

Usage

xtdml_plr$tune(
  param_set,
  tune_settings = list(
    n_folds_tune = 5,
    rsmp_tune = mlr3::rsmp("cv", folds = 5),
    measure = NULL,
    terminator = mlr3tuning::trm("evals", n_evals = 20),
    algorithm = mlr3tuning::tnr("grid_search"),
    resolution = 5
  ),
  tune_on_folds = FALSE
)

Arguments

param_set

(named list())
A named list with a parameter grid for each nuisance model/learner (see method learner_names()). The parameter grid must be an object of class ParamSet.

tune_settings

(named list())
A named list() with arguments passed to the hyperparameter-tuning with mlr3tuning to set up a tuning instance using mlr3tuning::TuningInstanceBatchSingleCrit$new() (see the mlr3tuning package).

tune_settings has entries

terminator (Terminator)
A Terminator object. Specification of terminator is required to perform tuning.
algorithm (Tuner or character(1))
A Tuner object (recommended) or a key passed to the respective dictionary to specify the tuning algorithm used in tnr(). algorithm is passed as an argument to tnr(). If algorithm is not specified by the user, the default is "grid_search". If set to "grid_search", the additional argument "resolution" is required.
rsmp_tune (Resampling or character(1))
A Resampling object (recommended) or option passed to rsmp() to initialize a Resampling for parameter tuning in mlr3. If not specified by the user, default is set to "cv" (cross-validation).
n_folds_tune (integer(1), optional)
If rsmp_tune = "cv", number of folds used for cross-validation. If not specified by the user, default is set to 5.
measure (NULL, named list(), optional)
Named list containing the measures used for parameter tuning. Entries in the list must either be Measure objects or keys to be passed to msr(). The names of the entries must match the learner names (see method learner_names()). If set to NULL, default measures are used, i.e., "regr.mse" for continuous outcome variables and "classif.ce" for binary outcomes.
resolution (integer(1))
The resolution of the grid, required when algorithm is "grid_search". resolution is passed as an argument to tnr().

tune_on_folds

(logical(1))
Indicates whether the tuning should be done fold-specific or globally. Default is FALSE.

Returns

self

Method `clone()`

The objects of this class are cloneable with this method.

Usage

xtdml_plr$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Details

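Consider the partially linear panel regression (PLR) model of the form

$Y_{it} = \theta_0 D_{it} + g_0(x_{it}) + \alpha_i + U_{it}$,

$D_{it} = m_0(x_{it}) + \gamma_i + V_{it}$,

where $\theta_0$ is the parameter of interest, $\alpha_i$ and $\gamma_i$ are unit-specific fixed effects, and $U_{it}$ and $V_{it}$ are error terms.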
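
For this model, the two score options of the score argument can be sketched in their standard double machine learning form. The block below is an illustration under assumptions, not the package's exact definition: the fixed effects $\alpha_i$ and $\gamma_i$ are taken as already removed by the chosen panel transformation (a step this page does not spell out), and $l_0(x_{it}) = E[Y_{it}|x_{it}]$ follows the definition given for ml_l.

```latex
% "orth-PO": partialling-out score, uses ml_l and ml_m
\[
\psi^{PO}(W_{it}; \theta, \eta)
  = \bigl(Y_{it} - l_0(x_{it}) - \theta\,(D_{it} - m_0(x_{it}))\bigr)
    \bigl(D_{it} - m_0(x_{it})\bigr)
\]

% "orth-IV": IV-type score, additionally uses ml_g
\[
\psi^{IV}(W_{it}; \theta, \eta)
  = \bigl(Y_{it} - \theta\,D_{it} - g_0(x_{it})\bigr)
    \bigl(D_{it} - m_0(x_{it})\bigr)
\]
```

Both scores are linear in $\theta$, so setting the sample average of the score to zero yields a closed-form estimate once the nuisance functions have been fitted.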

Examples

# An illustrative example using a regression tree (`rpart`)
library(xtdml)
library(mlr3)
library(rpart)
library(mlr3tuning)
set.seed(1234)

# Generate simulated dataset
data = make_plpr_data(n_obs = 100, t_per = 5, dim_x = 10, theta = 0.5, rho = 0.8)

x_cols = paste0("X", 1:10)

# Set up DML data environment
obj_xtdml_data = xtdml_data_from_data_frame(data,
  x_cols = x_cols, y_col = "y", d_cols = "d",
  panel_id = "id",
  time_id = "time",
  approach = "fd-exact")

# Set up DML estimation environment
# (one clone of the learner per nuisance function)
learner = lrn("regr.rpart")
ml_l = learner$clone()
ml_m = learner$clone()

obj_xtdml = xtdml_plr$new(obj_xtdml_data,
  ml_l = ml_l, ml_m = ml_m,
  score = "orth-PO", n_folds = 3)

# Set up a list of parameter grids, one ParamSet per learner
param_grid = list(
  "ml_l" = ps(cp = p_dbl(lower = 0.01, upper = 0.02),
              maxdepth = p_int(lower = 2, upper = 10)),
  "ml_m" = ps(cp = p_dbl(lower = 0.01, upper = 0.02),
              maxdepth = p_int(lower = 2, upper = 10)))

# Tuning settings; the tuner is passed under the documented entry `algorithm`
tune_settings = list(
  n_folds_tune = 3,
  rsmp_tune = mlr3::rsmp("cv", folds = 3),
  terminator = mlr3tuning::trm("evals", n_evals = 5),
  algorithm = tnr("grid_search", resolution = 10))

# Tune the nuisance learners, then fit the model
obj_xtdml$tune(param_set = param_grid, tune_settings = tune_settings)
obj_xtdml$fit()
