stabit2: Structural Matching Model to correct for sample selection bias in two-sided matching markets

Description

The function provides a Gibbs sampler for a structural matching model that corrects for sample selection bias when the selection process is a two-sided matching game, e.g., a matching of students to colleges.

The structural model consists of a selection equation and an outcome equation. The Selection Equation determines which matches are observed ($D=1$) and which are not ($D=0$). $$ \begin{array}{lcl} D &= & 1[V \in \Gamma] \\ V &= & W\beta + \eta \end{array} $$ Here, $V$ is a vector of latent valuations of all feasible matches, i.e., observed and unobserved, and $1[\cdot]$ is the Iverson bracket. A match is observed if its match valuation is in the set of valuations $\Gamma$ that satisfy the equilibrium condition (see Sorensen, 2007). The match valuation $V$ is a linear function of $W$, a matrix of characteristics for all feasible matches, and $\eta$, a vector of random errors. $\beta$ is a parameter vector to be estimated.

The Outcome Equation determines the outcome for observed matches. The dependent variable can be either continuous or binary, depending on the value of the binary argument. In the binary case, the dependent variable $R$ is determined by a threshold rule for the latent variable $Y$. $$ \begin{array}{lcl} R &= & 1[Y > c] \\ Y &= & X\alpha + \epsilon \end{array} $$ Here, $Y$ is a linear function of $X$, a matrix of characteristics for observed matches, and $\epsilon$, a vector of random errors. $\alpha$ is a parameter vector to be estimated.

The structural model imposes a linear relationship between the error terms of both equations as $\epsilon = \kappa\eta + \nu$, where $\nu$ is a vector of random errors and $\kappa$ is the covariance parameter to be estimated. If $\kappa$ were zero, $\epsilon$ and $\eta$ would be independent and the selection problem would vanish. That is, the observed outcomes would be a random sample from the population of interest.
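The role of $\kappa$ can be illustrated with a small base-R simulation. This is only a sketch under stated assumptions: the selection rule (keep the top 20% of valuations) is a hypothetical stand-in for the matching equilibrium, and all parameter values and variable names are chosen for illustration. It shows that OLS on the selected sample is biased when $\kappa \neq 0$, and that controlling for $\eta$ removes the bias.

```r
## Illustrative sketch only: the selection rule below is an assumed stand-in,
## not the equilibrium condition used by stabit2.
set.seed(1)
n     <- 20000
kappa <- 0.5                      # assumed covariance parameter
x     <- rnorm(n)                 # single covariate entering both equations
eta   <- rnorm(n)                 # selection-equation error
nu    <- rnorm(n)                 # independent outcome noise
eps   <- kappa * eta + nu         # outcome-equation error, eps = kappa*eta + nu

V <- 1 * x + eta                  # latent valuation with beta = 1
y <- 1 + 2 * x + eps              # outcome with intercept 1 and alpha = 2
D <- V > quantile(V, 0.8)         # hypothetical rule: top 20% of valuations observed

## With var(eta) = 1, cov(eps, eta) estimates kappa
cov_eps_eta <- cov(eps, eta)

## OLS on the selected sample: eta is correlated with x among selected units
ols_selected  <- coef(lm(y ~ x, subset = D))

## Adding eta as a control restores a consistent estimate of alpha
ols_corrected <- coef(lm(y ~ x + eta, subset = D))
```

In practice $\eta$ is not observed, which is why the function samples the latent valuations rather than using a regression control.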

Usage

stabit2(OUT, SEL = NULL, colleges = NULL, students = NULL, outcome, selection, binary = FALSE, niter, gPrior = FALSE, censored = 1, thin = 1)

Arguments

OUT

data frame with characteristics of all observed matches, including market identifier m.id, college identifier c.id and student identifier s.id.

SEL

optional: data frame with characteristics of all observed and unobserved matches, including market identifier m.id, college identifier c.id and student identifier s.id.

colleges

character vector of variable names for college characteristics. These variables take the same value in every match that involves a given college.

students

character vector of variable names for student characteristics. These variables take the same value in every match that involves a given student.

outcome

formula for match outcomes.

selection

formula for match valuations.

binary

logical: if TRUE outcome variable is taken to be binary; if FALSE outcome variable is taken to be continuous.

niter

number of iterations to use for the Gibbs sampler.

gPrior

logical: if TRUE the g-prior (Zellner, 1986) is used for the variance-covariance matrix. (Not yet implemented)

censored

controls censoring of the draws of the kappa parameter, which captures the covariation between the error terms of the selection and outcome equations: 0 = not censored, 1 = censored from below, 2 = censored from above.

thin

integer indicating the level of thinning in the MCMC draws. The default thin=1 saves every draw, thin=2 every second, etc.
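The data layout described for OUT and SEL can be sketched in base R as follows. This is a toy illustration, not output from the package: the identifier columns m.id, c.id and s.id follow the argument descriptions above, while the characteristic names c1 and s1, the indicator D, and all values are assumptions made for the example.

```r
## Hypothetical toy data: one market with 2 colleges and 4 students.
matched <- data.frame(m.id = 1, c.id = c(1, 1, 2, 2), s.id = 1:4)

## All feasible college-student pairs in the market (observed and unobserved)
SEL <- expand.grid(m.id = 1, c.id = 1:2, s.id = 1:4)

## Characteristics that are constant within a college / within a student
college_x <- data.frame(c.id = 1:2, c1 = c(0.3, -1.2))
student_x <- data.frame(s.id = 1:4, s1 = c(0.5, -0.4, 1.1, 0.0))
SEL <- merge(merge(SEL, college_x, by = "c.id"), student_x, by = "s.id")

## D = 1 for observed matches, 0 for feasible but unobserved matches
SEL$D <- as.integer(paste(SEL$c.id, SEL$s.id) %in%
                    paste(matched$c.id, matched$s.id))

## OUT contains the observed matches only
OUT <- SEL[SEL$D == 1, ]
```

Here SEL has one row per feasible pair and OUT one row per observed match; an outcome column would be added to OUT before estimation.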

Value

stabit2 returns a list with the following items.

References

Sorensen, M. (2007). How Smart is Smart Money? A Two-Sided Matching Model of Venture Capital. Journal of Finance, 62 (6): 2725-2762.

Examples

## --- SIMULATED EXAMPLE ---
## Not run: 
# ## 1. Simulate two-sided matching data for 20 markets (m=20) with 100 students
# ##    (nStudents=100) per market and 20 colleges with quotas of 5 students, each
# ##    (nSlots=rep(5,20)).
# 
# xdata <- stabsim2(m=20, nStudents=100, nSlots=rep(5,20), 
#   colleges = "c1",
#   students = "s1",
#   outcome = ~ c1:s1 + eta + nu,
#   selection = ~ -1 + c1:s1 + eta
# )
# head(xdata$OUT)
# 
# 
# ## 2-a. Bias from sorting
#  lm1 <- lm(y ~ c1:s1, data=xdata$OUT)
#  summary(lm1)
# 
# ## 2-b. Cause of the bias
#  with(xdata$OUT, cor(c1*s1, eta))
# 
# ## 2-c. Correction for sorting bias
#  lm2a <- lm(V ~ -1 + c1:s1, data=xdata$SEL); summary(lm2a)
#  etahat <- lm2a$residuals[xdata$SEL$D==1]
#  
#  lm2b <- lm(y ~ c1:s1 + etahat, data=xdata$OUT)
#  summary(lm2b)
# 
# 
# ## 3. Correction for sorting bias when match valuation V is unobserved
# 
# ## 3-a. Run Gibbs sampler (when SEL is given)
#  fit2 <- stabit2(OUT = xdata$OUT, 
#            SEL = xdata$SEL,
#            outcome = y ~ c1:s1, 
#            selection = ~ -1 + c1:s1,
#            niter=1000
#  )
# 
# ## 3-b. Run Gibbs sampler (when SEL is not given)
#  fit2 <- stabit2(OUT = xdata$OUT, 
#            colleges = "c1",
#            students = "s1",
#            outcome = y ~ c1:s1, 
#            selection = ~ -1 + c1:s1,
#            niter=1000
#  )
# 
# ## 4-a. Get marginal effects (for linear model)
#  fit2$coefs
#  
# ## 4-b. Get marginal effects (for probit)
#  #mfx(fit2)
#  
#  
# ## 5. Plot MCMC draws for coefficients
#  plot(fit2$draws$alphadraws[1,], type="l")
#  plot(fit2$draws$betadraws[1,], type="l")
#  plot(fit2$draws$kappadraws[1,], type="l")
# ## End(Not run)
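Beyond trace plots, the stored draws can be summarised numerically. The sketch below does not call the package: it uses a simulated matrix as a stand-in for fit2$draws$kappadraws and assumes, as the indexing kappadraws[1,] above suggests, that parameters are in rows and iterations in columns. The burn-in length and the extra thinning step are arbitrary illustrative choices.

```r
## Stand-in for fit2$draws$kappadraws: a 1 x niter matrix of simulated draws
## (assumed layout: parameters in rows, iterations in columns).
set.seed(42)
niter      <- 1000
kappadraws <- matrix(rnorm(niter, mean = 0.5, sd = 0.1), nrow = 1)

burnin <- niter / 2                        # discard the first half as burn-in
keep   <- seq(burnin + 1, niter, by = 2)   # then keep every second draw
post   <- kappadraws[1, keep]

kappa_mean <- mean(post)                       # posterior mean
kappa_ci   <- quantile(post, c(0.025, 0.975))  # 95% credible interval
```

With real output, replace the simulated matrix by the draws returned from the fit and inspect the trace plots first to choose a suitable burn-in.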
