This function generates population-level data for testing the estimation function. The data set incorporates splines in the MTRs.
gendistSplines()
A list of two data.frame objects: the first is the distribution of the simulated data, the second is the full simulated data set.
The distribution of the data is as follows:

         |      Z
   X/Z   |  0     1
  _______|___________
    -1   | 0.1   0.1
         |
 X   0   | 0.2   0.2
         |
     1   | 0.1   0.2
The data presented below have already been integrated over the unobservable term U, where U | X, Z ~ Unif[0, 1].
The propensity scores are generated according to the model
p(x, z) = 0.5 - 0.1 * x + 0.2 * z
         |      Z
  p(X,Z) |  0     1
  _______|___________
    -1   | 0.6   0.8
         |
 X   0   | 0.5   0.7
         |
     1   | 0.4   0.6
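The propensity score table can be reproduced directly from the model. The following is a minimal sketch, not the package's own code; the names `pscore`, `x`, `z`, and `p` are chosen here for illustration only.

```r
# Support of the covariate X and the instrument Z, as in the tables above.
x <- c(-1, 0, 1)
z <- c(0, 1)

# Propensity score model stated above: p(x, z) = 0.5 - 0.1 * x + 0.2 * z.
# (pscore is a hypothetical helper name, not part of the package.)
pscore <- function(x, z) 0.5 - 0.1 * x + 0.2 * z

# Evaluate the model on the full grid; rows index X, columns index Z.
p <- outer(x, z, pscore)
dimnames(p) <- list(X = as.character(x), Z = as.character(z))
print(p)
```

Each entry of `p` should agree with the corresponding cell of the p(X,Z) table.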
The lowest common multiple of the first table is 12. The lowest common multiple of the second table is 84. It turns out that 840 * 5 = 4200 observations are enough to generate the population data set such that each group has a whole number of observations.
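The whole-number claim can be checked numerically. The sketch below takes the two tables above at face value and computes the number of treated and untreated observations in each (X, Z) cell for 4200 observations. The names `fxz`, `n1`, `n0`, and `is_whole` are hypothetical and used for illustration only.

```r
# Joint distribution of (X, Z) and propensity scores, copied from the
# tables above (rows: X = -1, 0, 1; columns: Z = 0, 1).
fxz <- matrix(c(0.1, 0.1,
                0.2, 0.2,
                0.1, 0.2), nrow = 3, byrow = TRUE)
p <- matrix(c(0.6, 0.8,
              0.5, 0.7,
              0.4, 0.6), nrow = 3, byrow = TRUE)

n <- 840 * 5  # number of observations stated above

# Treated (D = 1) and untreated (D = 0) counts in each (X, Z) cell.
n1 <- n * fxz * p
n0 <- n * fxz * (1 - p)

# Compare against the nearest integer, allowing for floating-point error.
is_whole <- function(v) abs(v - round(v)) < 1e-8
all(is_whole(n1), is_whole(n0))
```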
The MTRs are defined as follows:
y1 ~ beta0 + beta1 * x + uSpline(degree = 2, knots = c(0.3, 0.6), intercept = FALSE)
The coefficients (beta0, beta1), and the coefficients on the splines, will be defined below.
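To illustrate the shape of this specification, the sketch below evaluates an MTR of the same form. It assumes that the uSpline term can be represented by a B-spline basis in u on [0, 1], built here with `splines::bs`. All coefficient values (`beta0`, `beta1`, `gamma`) are hypothetical placeholders, not the values used to generate the data.

```r
# Grid of values for the unobservable u on [0, 1].
u <- seq(0, 1, by = 0.01)

# Assumption: the uSpline term with degree = 2, knots = c(0.3, 0.6) and
# intercept = FALSE is represented by a B-spline basis on [0, 1].
basis <- splines::bs(u, degree = 2, knots = c(0.3, 0.6),
                     intercept = FALSE, Boundary.knots = c(0, 1))

# Without an intercept, the basis has degree + number of knots columns.
ncol(basis)

# Hypothetical coefficients, for illustration only.
beta0 <- 1
beta1 <- 0.5
gamma <- c(0.2, -0.1, 0.3, 0.4)  # one coefficient per basis function

# MTR for the treated group, evaluated on the grid of u for a given x.
m1 <- function(x) beta0 + beta1 * x + as.vector(basis %*% gamma)
head(m1(x = 0))
```

The spline term therefore contributes four coefficients in addition to beta0 and beta1.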
y0 ~ x : uSpline(degree = 0, knots = c(0.2, 0.5, 0.8), intercept = TRUE) + uSpline(degree = 1, knots = c(0.4), intercept = TRUE) + beta3 * I(u ^ 2)
The coefficient beta3, and the coefficients on the splines, will be defined below.