bilinear: Recursive Bivariate Linear Model

Description

Estimate two linear models with bivariate normally distributed error terms.

First stage (Linear):
$$m_i=\boldsymbol{\alpha}'\mathbf{w_i}+\lambda u_i$$
Second stage (Linear):
$$y_i = \boldsymbol{\beta}'\mathbf{x_i} + {\gamma}m_i + \sigma v_i$$
Endogeneity structure: $u_i$ and $v_i$ are bivariate normally distributed with a correlation of $\rho$.

Identifying this model requires an instrumental variable that appears in w but not in x. The model can still be estimated when the first-stage dependent variable is not a regressor in the second stage.
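
For reference, the two equations above imply a per-observation log-likelihood of the following form. This is a sketch that assumes $u_i$ and $v_i$ each have zero mean and unit variance, so that $\lambda$ and $\sigma$ act as the error scales; the parameterization used internally by the package may differ. Writing $u_i = (m_i-\boldsymbol{\alpha}'\mathbf{w_i})/\lambda$ and $v_i = (y_i-\boldsymbol{\beta}'\mathbf{x_i}-{\gamma}m_i)/\sigma$:

$$\ell_i = -\log(2\pi) - \log\lambda - \log\sigma - \tfrac{1}{2}\log(1-\rho^2) - \frac{u_i^2 - 2\rho u_i v_i + v_i^2}{2(1-\rho^2)}$$

The $-\log\lambda-\log\sigma$ terms come from the Jacobian of the change of variables from $(m_i, y_i)$ to $(u_i, v_i)$. Because the system is recursive (triangular), $\gamma$ contributes no Jacobian term of its own.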

Usage

bilinear(form1, form2, data = NULL, par = NULL, method = "BFGS", verbose = 0)

Value

A list containing the results of the estimated model, some of which are inherited from the return value of maxLik:

estimates: Model estimates with 95% confidence intervals. Prefix "1" means first stage variables.
estimate or par: Point estimates
variance_type: Type of covariance matrix used to calculate standard errors, either BHHH or Hessian.
var: Covariance matrix
se: Standard errors
var_bhhh: BHHH covariance matrix, the inverse of the outer product of the gradients at the maximum
se_bhhh: BHHH standard errors
gradient: Gradient at the maximum
hessian: Hessian matrix at maximum
gtHg: $g'H^{-1}g$, where $H^{-1}$ is simply the covariance matrix. A value close to zero (e.g., <1e-3 or 1e-6) indicates good convergence.
LL or maximum: Log-likelihood at the maximum
AIC: AIC
BIC: BIC
n_obs: Number of observations
n_par: Number of parameters
LR_stat: Likelihood ratio test statistic for $\rho=0$
LR_p: p-value of likelihood ratio test
iterations: Number of iterations taken to converge
message: Message regarding convergence status.

Note that the list inherits all the components in the output of maxLik. See the documentation of maxLik for more details.
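
The gtHg diagnostic can be illustrated on a toy problem. The sketch below is not the package's code: it fits a simple normal model with base R's optim (the toy likelihood, the names negll and neggrad, the starting values, and the tolerance are all assumptions made for illustration) and then computes $g'H^{-1}g$ from the gradient and Hessian at the reported optimum.

```r
# Toy illustration of the g'H^{-1}g convergence check (not the package's code).
set.seed(1)
y <- rnorm(500, mean = 2, sd = 1.5)

# Negative log-likelihood of a normal sample; p = c(mean, log(sd))
negll <- function(p) -sum(dnorm(y, mean = p[1], sd = exp(p[2]), log = TRUE))

# Analytic gradient of the negative log-likelihood
neggrad <- function(p) {
  s <- exp(p[2])
  r <- y - p[1]
  c(-sum(r) / s^2, length(y) - sum(r^2) / s^2)
}

fit <- optim(c(1, 0.5), negll, neggrad, method = "BFGS",
             hessian = TRUE, control = list(reltol = 1e-12))

g <- neggrad(fit$par)      # gradient at the reported optimum
V <- solve(fit$hessian)    # inverse Hessian of -logLik, i.e. a covariance estimate
gtHg <- drop(t(g) %*% V %*% g)
print(gtHg)                # should be very close to zero after convergence
```

Because the quadratic form weights the gradient by the covariance matrix, it is insensitive to how the parameters are scaled, which makes it a more comparable convergence measure than the raw gradient norm.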

Arguments

form1

Formula for the first linear model

form2

Formula for the second linear model

data

Input data, a data frame

par

Starting values for estimates

method

Optimization algorithm. Default is BFGS

verbose

An integer indicating how much output to display during the estimation process.

<0 - No output
0 - Basic output (model estimates)
1 - Moderate output, basic output + parameter and likelihood in each iteration
2 - Extensive output, moderate output + gradient values on each call

References

Peng, Jing. (2023) Identification of Causal Mechanisms from Randomized Experiments: A Framework for Endogenous Mediation Analysis. Information Systems Research, 34(1):67-84. Available at https://doi.org/10.1287/isre.2022.1113

Examples

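library(MASS)         # for mvrnorm()
library(endogeneity)  # package assumed to provide bilinear()

N = 2000
rho = -0.5
set.seed(1)

# Exogenous regressor x and instrument z (z enters only the first stage)
x = rbinom(N, 1, 0.5)
z = rnorm(N)

# Bivariate normal errors with correlation rho
e = mvrnorm(N, mu=c(0,0), Sigma=matrix(c(1,rho,rho,1), nrow=2))
e1 = e[,1]
e2 = e[,2]

# First stage: m depends on x and z; second stage: y depends on x and m
m = -1 + x + z + e1
y = -1 + x + m + e2

# Jointly estimate both stages and print the estimates with confidence intervals
est = bilinear(m~x+z, y~x+m)
print(est$estimates, digits=3)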
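
To see why the joint model matters for this simulated design, the sketch below re-creates the same data and fits the second stage by ordinary least squares, treating m as exogenous. It uses only base R and MASS; the object names ols and gamma_ols are introduced here for illustration. Under this design, the OLS coefficient on m converges to $1 + \rho/2 = 0.75$ rather than the true value of 1, because m is correlated with the second-stage error.

```r
library(MASS)  # for mvrnorm()

# Re-create the simulated data from the example above
N <- 2000
rho <- -0.5
set.seed(1)
x <- rbinom(N, 1, 0.5)
z <- rnorm(N)
e <- mvrnorm(N, mu = c(0, 0), Sigma = matrix(c(1, rho, rho, 1), nrow = 2))
m <- -1 + x + z + e[, 1]
y <- -1 + x + m + e[, 2]

# Naive single-equation OLS ignores the correlation between the two error terms
ols <- lm(y ~ x + m)
gamma_ols <- unname(coef(ols)["m"])
print(round(gamma_ols, 3))  # the true coefficient on m is 1
```

Comparing this value with the coefficient on m reported in est$estimates shows how much of the naive estimate is driven by endogeneity.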