
endogeneity (version 2.1.4)

probit_linearRE: Recursive Probit-LinearRE Model

Description

A panel extension of the probit_linear model. The first stage is a probit model at the individual level. The second stage is a panel linear model at the individual-time level with individual-level random effects. The random effect is correlated with the error term in the first stage.

First stage (Probit):

$$m_i=1(\boldsymbol{\alpha}'\mathbf{w}_i+u_i>0)$$

Second stage (Panel linear model with individual-level random effects):

$$y_{it} = \boldsymbol{\beta}'\mathbf{x}_{it} + \gamma m_i + \lambda v_i +\sigma \epsilon_{it}$$

Endogeneity structure: \(u_i\) and \(v_i\) are bivariate normally distributed with a correlation of \(\rho\).
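To make the role of \(\rho\) concrete, here is a sketch of the likelihood contribution of individual i implied by the equations above. It rests on assumptions not stated in the source: \(u_i\), \(v_i\) and \(\epsilon_{it}\) are taken to be standard normal, and \(\epsilon_{it}\) is taken to be independent across t and independent of \((u_i, v_i)\). The symbols \(q_i\), \(\phi\) and \(\Phi\) (standard normal density and distribution function) are notation introduced here, and the package's internal parameterization may differ.

```latex
\begin{align*}
u_i \mid v_i &\sim N\!\left(\rho v_i,\; 1-\rho^2\right), \qquad q_i = 2m_i - 1, \\
\Pr(m_i \mid v_i) &= \Phi\!\left( q_i\,\frac{\boldsymbol{\alpha}'\mathbf{w}_i + \rho v_i}{\sqrt{1-\rho^2}} \right), \\
L_i &= \int_{-\infty}^{\infty}
       \Phi\!\left( q_i\,\frac{\boldsymbol{\alpha}'\mathbf{w}_i + \rho v}{\sqrt{1-\rho^2}} \right)
       \prod_{t} \frac{1}{\sigma}\,
       \phi\!\left( \frac{y_{it} - \boldsymbol{\beta}'\mathbf{x}_{it} - \gamma m_i - \lambda v}{\sigma} \right)
       \phi(v)\, dv .
\end{align*}
```

Under these assumptions the integral over \(v\) is one-dimensional and has no closed form, so it has to be approximated numerically.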

This model uses Adaptive Gaussian Quadrature to overcome numerical challenges with long panels. w and x can be the same set of variables. Identification can be weak if the variables in w are poor predictors of m. The model still works if the first-stage dependent variable m is not a regressor in the second stage.
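As background for the quadrature mentioned above, the following is a minimal base-R sketch of plain (non-adaptive) Gauss-Hermite quadrature, which approximates an expectation over a standard normal random effect by a weighted sum over H nodes. It is not the package's implementation: `gauss_hermite` and `normal_expectation` are hypothetical helper names, and the adaptive variant additionally recenters and rescales the nodes for each individual.

```r
# Gauss-Hermite nodes and weights via the Golub-Welsch eigenvalue method.
# Hypothetical helper, not part of the endogeneity package. Requires H >= 2.
gauss_hermite <- function(H) {
  off <- sqrt(seq_len(H - 1) / 2)        # off-diagonal of the Jacobi matrix
  J <- matrix(0, H, H)
  J[cbind(1:(H - 1), 2:H)] <- off
  J[cbind(2:H, 1:(H - 1))] <- off
  eig <- eigen(J, symmetric = TRUE)
  list(nodes = eig$values,
       weights = sqrt(pi) * eig$vectors[1, ]^2)
}

# Approximate E[f(v)] for v ~ N(0, 1) using H quadrature points.
normal_expectation <- function(f, H = 20) {
  gh <- gauss_hermite(H)
  sum(gh$weights * f(sqrt(2) * gh$nodes)) / sqrt(pi)
}

# Example: a probit probability averaged over a normal random effect.
# The known closed form is pnorm(a / sqrt(1 + b^2)).
a <- 0.3
b <- 0.5
approx_value <- normal_expectation(function(v) pnorm(a + b * v))
exact_value <- pnorm(a / sqrt(1 + b^2))
```

With a fixed grid like this one, the approximation can deteriorate when the integrand is sharply peaked away from zero, which tends to happen as the number of periods per individual grows. That is the usual motivation for the adaptive version.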

Usage

probit_linearRE(
  form_probit,
  form_linear,
  id,
  data = NULL,
  par = NULL,
  method = "BFGS",
  H = 20,
  stopUpdate = F,
  init = c("zero", "unif", "norm", "default")[4],
  verbose = 0
)

Value

A list containing the results of the estimated model, some of which are inherited from the return value of maxLik:

  • estimates: Model estimates with 95% confidence intervals

  • estimate or par: Point estimates

  • variance_type: covariance matrix used to calculate standard errors. Either BHHH or Hessian.

  • var: covariance matrix

  • se: standard errors

  • var_bhhh: BHHH covariance matrix, inverse of the outer product of gradient at the maximum

  • se_bhhh: BHHH standard errors

  • gradient: Gradient function at maximum

  • hessian: Hessian matrix at maximum

  • gtHg: \(g'H^{-1}g\), where \(H^{-1}\) is simply the covariance matrix. A value close to zero (e.g., <1e-3 or 1e-6) indicates good convergence.

  • LL or maximum: Log-likelihood at the maximum

  • AIC: AIC

  • BIC: BIC

  • n_obs: Number of observations

  • n_par: Number of parameters

  • time: Time taken to estimate the model

  • LR_stat: Likelihood ratio test statistic for \(\rho=0\)

  • LR_p: p-value of likelihood ratio test

  • iterations: number of iterations taken to converge

  • message: Message regarding convergence status.

Note that the list inherits all the components in the output of maxLik. See the documentation of maxLik for more details.
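The LR_stat, LR_p and gtHg components listed above can be illustrated with a short base-R sketch. The helper names `lr_test` and `gtHg_value` are hypothetical, the numbers are made up for illustration and are not output of the package, and the sketch assumes that the test of \(\rho=0\) imposes a single restriction, so the statistic is compared against a chi-squared distribution with 1 degree of freedom.

```r
# Likelihood ratio test comparing an unrestricted fit with a restricted one.
# Hypothetical helper; df = 1 assumes a single restriction (rho = 0).
lr_test <- function(ll_full, ll_restricted, df = 1) {
  stat <- 2 * (ll_full - ll_restricted)
  p <- pchisq(stat, df = df, lower.tail = FALSE)
  list(LR_stat = stat, LR_p = p)
}

# Convergence diagnostic g' H^{-1} g, with the covariance matrix as H^{-1}.
gtHg_value <- function(g, vcov) {
  as.numeric(t(g) %*% vcov %*% g)
}

# Illustrative inputs only (not results from probit_linearRE).
lr <- lr_test(ll_full = -3500, ll_restricted = -3510)
conv <- gtHg_value(g = c(1e-4, -2e-4), vcov = diag(c(0.01, 0.04)))
```

A small p-value in `lr$LR_p` would point toward a nonzero correlation between the two stages, and a `conv` value near zero matches the convergence rule of thumb given for gtHg.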

Arguments

form_probit

Formula for the probit model at the individual level

form_linear

Formula for the linear model at the individual-time level

id

Group ID. A character string naming the ID column if data is supplied, or a numeric vector of IDs if data is not supplied

data

Input data, must be a data.table object

par

Starting values for estimates

method

Optimization algorithm. Default is BFGS

H

Number of quadrature points

stopUpdate

If TRUE, Adaptive Gaussian Quadrature is disabled

init

Initialization method

verbose

An integer indicating how much output to display during the estimation process.

  • <0 - No output

  • 0 - Basic output (model estimates)

  • 1 - Moderate output, basic output + parameter and likelihood in each iteration

  • 2 - Extensive output, moderate output + gradient values on each call

References

Chen, H., Peng, J., Li, H., & Shankar, R. (2022). Impact of Refund Policy on Sales of Paid Information Services: The Moderating Role of Product Characteristics. Available at SSRN: https://ssrn.com/abstract=4114972.

See Also

Other endogeneity: bilinear(), biprobit(), biprobit_latent(), biprobit_partial(), linear_probit(), pln_linear(), pln_probit(), probit_linear(), probit_linear_latent(), probit_linear_partial()

Examples

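library(endogeneity)
library(MASS)
library(data.table)

# Simulation settings: N individuals, each observed for `period` time points
N = 500
period = 5
obs = N*period
rho = -0.5
set.seed(100)

# Individual-level errors with correlation rho:
# e1 enters the probit stage, e2 is the random effect in the linear stage
e = mvrnorm(N, mu=c(0,0), Sigma=matrix(c(1,rho,rho,1), nrow=2))
e1 = e[,1]
e2 = e[,2]

# Panel indices in long format
t = rep(1:period, N)
id = rep(1:N, each=period)

# First stage: binary outcome m at the individual level
w = rnorm(N)
m = as.numeric(1+w+e1>0)
m_long = rep(m, each=period)

# Second stage: outcome y at the individual-time level
x = rnorm(obs)
y = 1 + x + m_long + rep(e2, each=period) + rnorm(obs)

dt = data.table(y, x, id, t, m=m_long, w=rep(w, each=period))

# Fit the recursive probit-linearRE model and print the estimates
est = probit_linearRE(m~w, y~x+m, 'id', dt)
print(est$estimates, digits=3)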
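For context on why the correlation matters, the following base-R sketch simulates data with the same structure and fits a naive pooled regression that ignores the endogeneity of m. It is a separate illustration, not part of the package example: it draws the correlated errors with rnorm rather than MASS::mvrnorm, uses a larger N and a different seed, and the names `naive` and `gamma_ols` are introduced here.

```r
set.seed(1)
N <- 2000
period <- 5
rho <- -0.5

# Correlated individual-level errors (unit variances, correlation rho)
e1 <- rnorm(N)
e2 <- rho * e1 + sqrt(1 - rho^2) * rnorm(N)

# First stage: binary m depends on w and e1
w <- rnorm(N)
m <- as.numeric(1 + w + e1 > 0)
m_long <- rep(m, each = period)

# Second stage: the true coefficient on m is 1
x <- rnorm(N * period)
y <- 1 + x + m_long + rep(e2, each = period) + rnorm(N * period)

# Naive pooled OLS treats m as exogenous
naive <- lm(y ~ x + m_long)
gamma_ols <- unname(coef(naive)["m_long"])
```

Because rho is negative here, individuals with m = 1 tend to have a lower random effect, so `gamma_ols` should fall noticeably below the true value of 1. The coefficient on x is not affected in this design, since x is generated independently of the errors.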
