Estimate a Poisson Lognormal model and a linear model with bivariate normally distributed error/heterogeneity terms.
First stage (Poisson Lognormal):
$$E[m_i|w_i,u_i]=\exp(\boldsymbol{\alpha}'\mathbf{w_i}+\lambda u_i)$$
Second stage (Linear):
$$y_i = \boldsymbol{\beta}'\mathbf{x_i} + {\gamma}m_i + \sigma v_i$$
Endogeneity structure:
\(u_i\) and \(v_i\) are bivariate normally distributed with a correlation of \(\rho\).
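As a worked illustration of this structure, the joint distribution can be written out as below. It assumes both terms are standardized (zero mean, unit variance). The separate scale parameters \(\lambda\) and \(\sigma\) suggest this, but the text does not state it explicitly. Under that assumption, the conditional distribution of \(v_i\) given \(u_i\) follows from standard bivariate normal results:
$$\begin{pmatrix}u_i\\ v_i\end{pmatrix}\sim N\left(\begin{pmatrix}0\\ 0\end{pmatrix},\begin{pmatrix}1&\rho\\ \rho&1\end{pmatrix}\right),\qquad v_i\mid u_i\sim N\left(\rho u_i,\;1-\rho^2\right)$$
When \(\rho=0\) the two stages are independent and could be estimated separately. A nonzero \(\rho\) is what makes \(m_i\) endogenous in the second stage.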
This model is typically well-identified even if w and x are the same set of variables. This model still works if the first-stage dependent variable is not a regressor in the second stage.
pln_linear(
form_pln,
form_linear,
data = NULL,
par = NULL,
method = "BFGS",
init = c("zero", "unif", "norm", "default")[4],
H = 20,
verbose = 0
)
A list containing the results of the estimated model, some of which are inherited from the return of maxLik:
estimates: Model estimates with 95% confidence intervals. Prefix "pln" means first stage variables.
estimate or par: Point estimates
variance_type: covariance matrix used to calculate standard errors. Either BHHH or Hessian.
var: covariance matrix
se: standard errors
gradient: Gradient function at maximum
hessian: Hessian matrix at maximum
gtHg: \(g'H^{-1}g\), where \(H^{-1}\) is simply the covariance matrix. A value close to zero (e.g., below 1e-3 or 1e-6) indicates good convergence.
LL or maximum: Log-likelihood at the maximum
AIC: AIC
BIC: BIC
n_obs: Number of observations
n_par: Number of parameters
LR_stat: Likelihood ratio test statistic for \(\rho=0\)
LR_p: p-value of likelihood ratio test
iterations: number of iterations taken to converge
message: Message regarding convergence status.
Note that the list inherits all the components in the output of maxLik. See the documentation of maxLik for more details.
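The diagnostics listed above can be reproduced by hand from their definitions. The following is a minimal base-R sketch using made-up numbers, not output from pln_linear. The gradient `g`, covariance matrix `V`, and log-likelihood values are hypothetical. The one degree of freedom for the likelihood ratio test is an assumption based on \(\rho=0\) being a single restriction.

```r
# Hypothetical gradient and covariance matrix at the optimum (illustration only)
g <- c(1e-4, -2e-4)
V <- diag(c(0.01, 0.04))

# Convergence diagnostic g' H^{-1} g, with H^{-1} taken as the covariance matrix
gtHg <- drop(t(g) %*% V %*% g)

# Hypothetical log-likelihoods with and without the restriction rho = 0
LL_full <- -5000.0
LL_restricted <- -5004.5

# Likelihood ratio statistic and p-value (assuming 1 degree of freedom)
LR_stat <- 2 * (LL_full - LL_restricted)
LR_p <- pchisq(LR_stat, df = 1, lower.tail = FALSE)

# Information criteria from the standard definitions
n_par <- 8
n_obs <- 2000
AIC_val <- 2 * n_par - 2 * LL_full
BIC_val <- n_par * log(n_obs) - 2 * LL_full
```

A small `gtHg` together with a small `LR_p` would suggest that the optimizer converged and that the correlation between the two stages is statistically distinguishable from zero.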
form_pln: Formula for the first-stage Poisson lognormal model
form_linear: Formula for the second-stage linear model
data: Input data, a data frame
par: Starting values for estimates
method: Optimization algorithm
init: Initialization method
H: Number of quadrature points
verbose: An integer indicating how much output to display during the estimation process.
<0 - No output
0 - Basic output (model estimates)
1 - Moderate output, basic output + parameter and likelihood in each iteration
2 - Extensive output, moderate output + gradient values on each call
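To give a sense of what the number of quadrature points controls, here is a minimal base-R sketch of Gauss-Hermite quadrature for integrating a function over a standard normal term such as \(u_i\). It assumes the package uses a Gauss-Hermite rule, which the documentation does not confirm. The helper names `gauss_hermite` and `expect_std_normal` are hypothetical and not part of the package.

```r
# Gauss-Hermite nodes and weights via the Golub-Welsch algorithm (base R only).
# Weight function is exp(-x^2), so the weights sum to sqrt(pi).
gauss_hermite <- function(H) {
  stopifnot(H >= 2)
  k <- seq_len(H - 1)
  J <- matrix(0, H, H)
  off <- sqrt(k / 2)            # off-diagonal entries of the Jacobi matrix
  J[cbind(k, k + 1)] <- off
  J[cbind(k + 1, k)] <- off
  eig <- eigen(J, symmetric = TRUE)
  list(nodes = eig$values, weights = sqrt(pi) * eig$vectors[1, ]^2)
}

# Approximate E[f(u)] for u ~ N(0, 1) using the change of variable u = sqrt(2) * x
expect_std_normal <- function(f, H = 20) {
  gh <- gauss_hermite(H)
  sum(gh$weights * f(sqrt(2) * gh$nodes)) / sqrt(pi)
}

# Check against a known result: E[exp(lambda * u)] = exp(lambda^2 / 2)
lambda <- 1
approx_val <- expect_std_normal(function(u) exp(lambda * u), H = 20)
exact_val <- exp(lambda^2 / 2)
```

More points generally give a more accurate integral at a higher computational cost per likelihood evaluation, so increasing H is a reasonable check when estimates look unstable.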
Peng, Jing. (2023) Identification of Causal Mechanisms from Randomized Experiments: A Framework for Endogenous Mediation Analysis. Information Systems Research, 34(1):67-84. Available at https://doi.org/10.1287/isre.2022.1113
Other endogeneity: bilinear(), biprobit(), biprobit_latent(), biprobit_partial(), linear_probit(), pln_probit(), probit_linear(), probit_linearRE(), probit_linear_latent(), probit_linear_partial()
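library(MASS)
# Simulate data with correlated first- and second-stage errors
N = 2000
rho = -0.5
set.seed(1)
x = rbinom(N, 1, 0.5)
z = rnorm(N)
e = mvrnorm(N, mu=c(0,0), Sigma=matrix(c(1,rho,rho,1), nrow=2))
e1 = e[,1]
e2 = e[,2]
# First stage: Poisson lognormal count m; second stage: linear outcome y that depends on m
m = rpois(N, exp(1 + x + z + e1))
y = 1 + x + m + e2
# Estimate both stages jointly and print the estimates with confidence intervals
est = pln_linear(m~x+z, y~x+m)
print(est$estimates, digits=3)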