causalQual (version 1.0.0)

generate_qualitative_data_iv: Generate Qualitative Data (Instrumental Variables)

Description

Generate a synthetic data set with qualitative outcomes under an instrumental variables design. The data include a binary treatment indicator and a binary instrument. Potential outcomes and potential treatments are independent of the instrument. Moreover, the instrument does not directly impact potential outcomes, has an impact on treatment probability, and can only increase the probability of treatment.

Usage

generate_qualitative_data_iv(n, outcome_type)

Value

A list storing a data frame with the observed data, the true propensity score, the true instrument propensity score, and the true local probabilities of shift.

Arguments

n

Sample size.

outcome_type

String controlling the outcome type. Must be either "multinomial" or "ordered". Affects how potential outcomes are generated.

Author

Riccardo Di Francesco

Details

Outcome type

Potential outcomes are generated differently according to outcome_type. If outcome_type == "multinomial", generate_qualitative_data_iv computes linear predictors for each class using the covariates:

$$\eta_{mi} (d) = \beta_{m1}^d X_{i1} + \beta_{m2}^d X_{i2} + \beta_{m3}^d X_{i3}, \quad d = 0, 1,$$

and then transforms \(\eta_{mi} (d)\) into valid probability distributions using the softmax function:

$$P(Y_i(d) = m | X_i) = \frac{\exp(\eta_{mi} (d))}{\sum_{m'} \exp(\eta_{m'i}(d))}, \quad d = 0, 1.$$

It then generates potential outcomes \(Y_i(1)\) and \(Y_i(0)\) by sampling from \(\{1, 2, 3\}\) with probabilities \(P(Y_i(d) = m | X_i), \, d = 0, 1\).
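The multinomial step can be sketched in a few lines of base R. This is an illustration only, not the package's internal code: the coefficient matrix `beta` is a hypothetical placeholder (the actual values of \(\beta_{mj}^d\) are not listed here), and a single treatment arm is shown.

```r
set.seed(1)
n <- 5
X <- matrix(runif(n * 3), ncol = 3)    # three independent U(0,1) covariates

# Hypothetical coefficients for one treatment arm d:
# row m holds (beta_m1, beta_m2, beta_m3). These values are assumptions.
beta <- matrix(c(1, 0, 0,
                 0, 1, 0,
                 0, 0, 1), nrow = 3, byrow = TRUE)

eta <- X %*% t(beta)                   # n x 3 matrix, entry (i, m) is eta_mi(d)
probs <- exp(eta) / rowSums(exp(eta))  # row-wise softmax, each row sums to 1

# Draw one class in {1, 2, 3} per unit using its own probability vector.
y <- apply(probs, 1, function(p) sample(1:3, size = 1, prob = p))
```

Repeating the same steps with a second coefficient matrix would give the other potential outcome.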

If instead outcome_type == "ordered", generate_qualitative_data_iv first generates latent potential outcomes:

$$Y_i^* (d) = \tau d + X_{i1} + X_{i2} + X_{i3} + N (0, 1), \quad d = 0, 1,$$

with \(\tau = 2\). It then constructs \(Y_i (d)\) by discretizing \(Y_i^* (d)\) using threshold parameters \(\zeta_1 = 2\) and \(\zeta_2 = 4\), with the conventions \(\zeta_0 = -\infty\) and \(\zeta_3 = +\infty\). It follows that

$$P(Y_i(d) = m | X_i) = P(\zeta_{m-1} < Y_i^*(d) \leq \zeta_m | X_i) = \Phi (\zeta_m - \sum_j X_{ij} - \tau d) - \Phi (\zeta_{m-1} - \sum_j X_{ij} - \tau d), \quad d = 0, 1,$$

which allows us to analytically compute the local probabilities of shift.
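As a rough sketch of the ordered case, the class probabilities above can be evaluated with `pnorm`, and the discretization can be done with `cut`. The helper names `class_probs` and `draw_outcome` are hypothetical and not part of causalQual; the outer thresholds \(-\infty\) and \(+\infty\) are the assumed conventions for the first and last class.

```r
tau  <- 2
zeta <- c(-Inf, 2, 4, Inf)  # zeta_0, zeta_1, zeta_2, zeta_3 (outer values assumed)

# P(Y_i(d) = m | X_i) for m = 1, 2, 3, returned as an n x 3 matrix (n > 1).
class_probs <- function(X, d) {
  mu <- rowSums(X) + tau * d
  sapply(1:3, function(m) pnorm(zeta[m + 1] - mu) - pnorm(zeta[m] - mu))
}

# Draw Y_i(d) by discretizing the latent outcome; cut() uses (a, b] intervals,
# matching zeta_{m-1} < Y* <= zeta_m.
draw_outcome <- function(X, d) {
  y_star <- rowSums(X) + tau * d + rnorm(nrow(X))
  cut(y_star, breaks = zeta, labels = FALSE)
}

set.seed(1)
X  <- matrix(runif(4 * 3), ncol = 3)
p1 <- class_probs(X, 1)
p0 <- class_probs(X, 0)
y1 <- draw_outcome(X, 1)
```

Because \(\tau > 0\), treatment shifts probability mass toward the highest class, so each entry of `p1[, 3]` exceeds the corresponding entry of `p0[, 3]`.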

Treatment assignment and instrument

The instrument is always generated as \(Z_i \sim \text{Bernoulli}(0.5)\). Treatment is always modeled as \(D_i \sim \text{Bernoulli}(\pi(X_i, Z_i))\), with \(\pi(X_i, Z_i) = P(D_i = 1 | X_i, Z_i) = (X_{i1} + X_{i3} + Z_i) / 3\). Thus, setting \(Z_i = 1\) raises the probability of treatment intake by \(1/3\) for every unit and can never decrease it.
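The assignment mechanism described above can be reproduced in base R as follows. This is a minimal sketch, not the package source; `pi_fun` is a hypothetical helper name introduced for illustration.

```r
set.seed(1)
n <- 1000
X <- matrix(runif(n * 3), ncol = 3)   # three independent U(0,1) covariates

# Treatment probability pi(X_i, Z_i) = (X_i1 + X_i3 + Z_i) / 3, always in [0, 1].
pi_fun <- function(X, z) (X[, 1] + X[, 3] + z) / 3

Z <- rbinom(n, size = 1, prob = 0.5)          # instrument
pi_xz <- pi_fun(X, Z)
D <- rbinom(n, size = 1, prob = pi_xz)        # treatment indicator

# Effect of the instrument on the treatment probability, unit by unit.
first_stage <- pi_fun(X, 1) - pi_fun(X, 0)
```

Every element of `first_stage` equals 1/3, which is the sense in which the instrument can only increase the probability of treatment.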

Other details

The function always generates three independent covariates from \(U(0,1)\). Observed outcomes \(Y_i\) are always constructed using the usual observational rule, \(Y_i = D_i Y_i(1) + (1 - D_i) Y_i(0)\).

See Also

generate_qualitative_data_soo, generate_qualitative_data_rd, generate_qualitative_data_did

Examples

## Load the package and generate synthetic data.
library(causalQual)

set.seed(1986)

data <- generate_qualitative_data_iv(100,
                                     outcome_type = "ordered")

## True local probabilities of shift.
data$local_pshifts
