A simulated dataset with 200 individuals and 10 periods. The true data generating process is the following:
Selection equation (ProbitRE - Probit model with individual level random effects):
$$z_{it}=1(1+x_{it}+w_{it}+u_i+\xi_{it} > 0)$$
Outcome equation (PLN_RE - Poisson Lognormal model with individual-time level random effects):
$$E[y_{it}|x_{it},v_i,\epsilon_{it}] = \exp(-1+x_{it} + v_i + \epsilon_{it})$$
Correlation (self-selection at both individual and individual-time level):
\(u_i\) and \(v_i\) are bivariate normally distributed with a correlation of 0.25.
\(\xi_{it}\) and \(\epsilon_{it}\) are bivariate normally distributed with a correlation of 0.5.
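The process above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the code used to build the dataset. Several details are assumptions that the description does not state: the covariates \(x_{it}\) and \(w_{it}\) are drawn as independent standard normals, all random effects and error terms have unit variance, the seed is arbitrary, and the function names and the `time` key are hypothetical.

```python
import math
import random


def correlated_normals(rng, rho):
    """Draw a bivariate normal pair with unit variances (assumed) and correlation rho."""
    a = rng.gauss(0.0, 1.0)
    b = rho * a + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    return a, b


def poisson(rng, lam):
    """Draw a Poisson(lam) count with Knuth's multiplication method."""
    if lam > 30.0:
        # The sum of two independent Poisson(lam / 2) draws is Poisson(lam);
        # splitting keeps exp(-lam) well away from floating-point underflow.
        return poisson(rng, lam / 2.0) + poisson(rng, lam / 2.0)
    limit, count, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def simulate(n_ind=200, n_periods=10, rho_re=0.25, rho_err=0.5, seed=1):
    """Return one dict per individual-period, following the equations above."""
    rng = random.Random(seed)
    rows = []
    for i in range(1, n_ind + 1):
        # Individual-level random effects (u_i, v_i), correlation 0.25.
        u, v = correlated_normals(rng, rho_re)
        for t in range(1, n_periods + 1):
            # Assumption: covariates are independent standard normals.
            x = rng.gauss(0.0, 1.0)
            w = rng.gauss(0.0, 1.0)
            # Individual-time level errors (xi_it, eps_it), correlation 0.5.
            xi, eps = correlated_normals(rng, rho_err)
            z = 1 if 1.0 + x + w + u + xi > 0 else 0
            # The outcome is observed only when z = 1; otherwise it is missing.
            y = poisson(rng, math.exp(-1.0 + x + v + eps)) if z == 1 else None
            rows.append({"id": i, "time": t, "z": z, "y": y, "x": x, "w": w})
    return rows
```

Calling `simulate()` returns 2000 rows (200 individuals times 10 periods). Because the two pairs of unobservables are correlated, a Poisson regression fitted only to the rows with `z == 1` would generally give biased estimates, which is the self-selection problem the dataset is designed to illustrate.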
sim
A simulated dataset with 200 individuals and 10 periods, containing the following variables:
Individual id (i), from 1 to 200
Time period (t), from 1 to 10
Selection indicator (z): whether an individual is selected in a given period. The outcome is observed only when z = 1
Outcome (y): the outcome of an individual in a given period
Covariate (x) influencing both z and y, with a true coefficient of 1 in each equation
Covariate (w) influencing only z, with a true coefficient of 1