copulaCorrection: Fitting Linear Models with Endogenous Regressors using Gaussian Copula

Description

Fits linear models with continuous or discrete endogenous regressors using Gaussian copulas, following the method presented in Park and Gupta (2012). This statistical technique addresses the endogeneity problem without requiring external instrumental variables. The key assumption of the model is that the endogenous variables are NOT normally distributed.
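Because the approach relies on the endogenous regressor being non-normal, it can help to check this before fitting. The following is a minimal sketch in base R, using simulated data as a stand-in for a real regressor; the variable `P` and the choice of the Shapiro-Wilk test are illustrative assumptions, not part of the package.

```r
# Simulated stand-in for an endogenous regressor (hypothetical data, not from the package).
set.seed(42)
P <- rexp(500, rate = 1)   # strongly skewed, hence clearly non-normal

# Shapiro-Wilk test: a small p-value rejects the null hypothesis of normality.
sw <- shapiro.test(P)
print(sw$p.value)

# Visual check: points far from the reference line indicate non-normality.
qqnorm(P)
qqline(P)
```

A large p-value here would suggest that the regressor is close to normal, in which case the non-normality assumption stated above is doubtful.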

Usage

copulaCorrection(y, X, P, param, type, method, intercept, data)

Arguments

y

the vector or matrix containing the dependent variable.

X

the data frame or matrix containing the regressors of the model, both exogenous and endogenous. The last column(s) should contain the endogenous variable(s).

P

the matrix or vector containing the endogenous variables.

param

the vector of initial values for the parameters of the model to be supplied to the optimization algorithm. The parameters to be estimated are theta = {b,a,rho,sigma}, where b are the parameters of the exogenous variables, a is the parameter of the endogenous variable, rho is the parameter for the correlation between the error and the endogenous regressor, while sigma is the standard deviation of the structural error.

type

the type of the endogenous regressor(s). It can take two values, "continuous" or "discrete".

method

the method used for estimating the model. It can take two values, "1" or "2", where "1" is the ML approach described in Park and Gupta (2012), and "2" is the equivalent OLS approach described in the same paper. "1" can be applied only when there is a single continuous endogenous variable. With one discrete or more than one continuous endogenous regressor, the second method is applied by default.

intercept

optional parameter. By default the model is estimated with an intercept. If no intercept is desired, or the regressor matrix X already contains a column of ones, intercept should be given the value "no".

data

optional data frame or matrix containing the variables of the model.
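As an illustration of how the param vector might be assembled, the sketch below builds starting values in the order given above (b, a, rho, sigma) from a naive OLS fit. The data are simulated, and taking OLS coefficients, rho = 0 and the residual standard deviation as starting points is an assumption made for this example, not a recommendation from the package.

```r
# Hypothetical data: two exogenous regressors and one regressor in the last column.
set.seed(1)
n  <- 200
X1 <- rnorm(n)
X2 <- rnorm(n)
P  <- rexp(n)
X  <- cbind(X1, X2, P)     # endogenous variable placed in the last column
y  <- as.vector(X %*% c(1, -2, 0.5) + rnorm(n))

# Naive OLS without intercept; coefficients come out in column order (b first, then a).
ols <- lm(y ~ X - 1)

# Assumed starting values: OLS coefficients, rho = 0, sigma = residual standard deviation.
param <- c(coef(ols), rho = 0, sigma = sd(residuals(ols)))
print(param)
```

The resulting vector has one entry per column of X plus two, matching the length of the param vector shown in the commented boots() call in the Examples.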

Value

Depending on the method and the type of the variables, it returns the optimal values of the parameters and, in the case of the second method, their standard errors. With one endogenous variable, if the maximum likelihood approach is chosen, the standard errors can be computed by bootstrapping using the boots function from the same package.

Details

The maximum likelihood estimation is performed by the "BFGS" algorithm. When there are two endogenous regressors, there is no need for initial parameters, since the method applied by default is the augmented OLS, which can also be requested explicitly with method = "2".
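The idea behind the augmented OLS can be sketched in base R: the regression is extended with the term P* = qnorm(H(P)), where H is the empirical distribution function of the endogenous regressor. The example below uses simulated data and is only an illustration of the technique, not the package's implementation; in particular, dividing the ranks by n + 1 to keep H(P) below 1 is an assumption made here.

```r
set.seed(123)
n   <- 2000
rho <- 0.5

# Correlated standard normals: z drives the regressor, eps is the structural error.
z   <- rnorm(n)
eps <- rho * z + sqrt(1 - rho^2) * rnorm(n)

X1 <- rnorm(n)               # exogenous regressor
P  <- qexp(pnorm(z))         # endogenous, non-normal regressor correlated with eps
y  <- 1 * X1 - 1 * P + eps   # true coefficient on P is -1

# Copula term: P* = qnorm(H(P)), with H the empirical CDF of P.
# Assumption: ranks are divided by n + 1 so that H(P) stays strictly below 1.
P_star <- qnorm(rank(P) / (n + 1))

naive     <- lm(y ~ X1 + P - 1)            # ignores the endogeneity
augmented <- lm(y ~ X1 + P + P_star - 1)   # adds the copula correction term

print(coef(naive))
print(coef(augmented))
```

In this simulation the coefficient on P from the augmented regression should lie closer to the true value of -1 than the naive estimate. The usual OLS standard errors of the augmented regression do not account for the estimation of P*.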

References

Park, S. and Gupta, S. (2012), 'Handling Endogenous Regressors by Joint Estimation Using Copulas', Marketing Science, 31(4), 567-586.

Examples

# load dataset dataCopC1, where P is endogenous, continuous and not normally distributed
data(dataCopC1)
y <- dataCopC1[,1]
X <- dataCopC1[,2:5]
P <- dataCopC1[,5]
c1 <- copulaCorrection(y, X, P, type = "continuous", method = "1", intercept = FALSE)
summary(c1)
# to obtain the standard errors use the boots() function
# se.c1 <- boots(10, y, X, P, param = c(1, 1, -2, -0.5, 0.2, 1), intercept = FALSE)

# an alternative model can be obtained using method = "2"
c12 <- copulaCorrection(y, X, P, type = "continuous", method = "2", intercept = FALSE)
summary(c12)

# load dataset with 2 continuous, non-normally distributed endogenous regressors
# with 2 endogenous regressors no initial parameters are needed; the default is the augmented OLS
data(dataCopC2)
y <- dataCopC2[,1]
X <- dataCopC2[,2:6]
P <- dataCopC2[,5:6]
c2 <- copulaCorrection(y, X, P, type = "continuous", method = "2", intercept = FALSE)
summary(c2)

# load dataset with 1 discrete endogenous variable
# having more than 1 discrete endogenous regressor is also possible
data(dataCopDis)
y <- dataCopDis[,1]
X <- dataCopDis[,2:5]
P <- dataCopDis[,5]
c3 <- copulaCorrection(y, X, P, type = "discrete", intercept = FALSE)
summary(c3)
