matchingMarkets (version 0.1-1)

stabit: Structural Matching Model to correct for sample selection bias

Description

The function provides a Gibbs sampler for a structural matching model that corrects for sample selection bias when the selection process is a one-sided matching game; that is, group/coalition formation.

The input is individual-level data on all group members from one-sided matching markets; that is, from group/coalition formation games.

In a first step, the function generates a model matrix with characteristics of all feasible groups of the same size as the observed groups in the market.

For example, in the stable roommates problem with $n=4$ students $\{1,2,3,4\}$ sorting into groups of 2, we have ${4 \choose 2}=6$ feasible groups: (1,2)(3,4), (1,3)(2,4), (1,4)(2,3).

In the group formation problem with $n=6$ students $\{1,2,3,4,5,6\}$ sorting into groups of 3, we have ${6 \choose 3}=20$ feasible groups. For the same students sorting into groups of sizes 2 and 4, we have ${6 \choose 2} + {6 \choose 4}=30$ feasible groups.
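The counts above can be checked with base R. This is only a sketch of the combinatorics, not of how stabit builds its model matrix internally; the variable names are made up for illustration.

```r
# Enumerate feasible groups with base R; each column of combn() is one group.
students <- 1:4
pairs <- combn(students, 2)      # 2 x 6 matrix, one feasible pair per column
n_pairs <- ncol(pairs)           # equals choose(4, 2)

# Six students sorting into groups of 3
n_triples <- choose(6, 3)

# Six students sorting into groups of sizes 2 and 4
n_mixed <- choose(6, 2) + choose(6, 4)

print(c(pairs = n_pairs, triples = n_triples, mixed = n_mixed))
```

The number of feasible groups grows quickly with market size, which is presumably why the max.combs argument exists to cap it.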

The structural model consists of a selection and an outcome equation. The Selection Equation determines which matches are observed ($D=1$) and which are not ($D=0$). $$\begin{array}{lcl} D &= & 1[V \in \Gamma] \\ V &= & W\alpha + \eta \end{array}$$ Here, $V$ is a vector of latent valuations of all feasible matches, i.e. observed and unobserved, and $1[.]$ is the Iverson bracket. A match is observed if its match valuation is in the set of valuations $\Gamma$ that satisfy the equilibrium condition (see Klein, 2014). This condition differs for matching games with transferable and non-transferable utility and can be specified using the method argument ("TU" or "NTU"). The match valuation $V$ is a linear function of $W$, a matrix of characteristics for all feasible groups, and $\eta$, a vector of random errors. $\alpha$ is a parameter vector to be estimated.

The Outcome Equation determines the outcome for observed matches. The dependent variable can be either continuous or binary, depending on the value of the binary argument. In the binary case, the dependent variable $R$ is determined by a threshold rule for the latent variable $Y$. $$\begin{array}{lcl} R &= & 1[Y > c] \\ Y &= & X\beta + \epsilon \end{array}$$ Here, $Y$ is a linear function of $X$, a matrix of characteristics for observed matches, and $\epsilon$, a vector of random errors. $\beta$ is a parameter vector to be estimated.
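As a rough illustration of the threshold rule in the binary case, the following base R sketch simulates a latent outcome and derives the observed binary variable from it. The design matrix, coefficient values and the threshold of 0 are assumptions chosen for the example and are not taken from the package.

```r
set.seed(1)
n <- 5
X <- cbind(intercept = 1, x1 = rnorm(n))   # hypothetical characteristics of observed matches
beta <- c(0.5, 1)                          # hypothetical coefficients
eps <- rnorm(n)                            # random errors
Y <- as.vector(X %*% beta + eps)           # latent outcome Y = X beta + eps
c0 <- 0                                    # assumed threshold c
R <- as.integer(Y > c0)                    # observed binary outcome R = 1[Y > c]
print(data.frame(Y = round(Y, 2), R = R))
```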

The structural model imposes a linear relationship between the error terms of both equations as $\epsilon = \delta\eta + \xi$, where $\xi$ is a vector of random errors and $\delta$ is the covariance parameter to be estimated. If $\delta$ were zero, $\epsilon$ and $\eta$ would be independent and the selection problem would vanish. That is, the observed outcomes would be a random sample from the population of interest.
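The role of $\delta$ can be seen in a small simulation. The sketch below assumes standard normal $\eta$ and $\xi$ (an assumption made here for illustration) and shows that the sample covariance between $\epsilon$ and $\eta$ is close to $\delta$ when $\delta \neq 0$ and close to zero otherwise.

```r
set.seed(123)
n <- 100000
delta <- 0.5                 # hypothetical covariance parameter
eta <- rnorm(n)              # selection-equation errors (assumed standard normal)
xi <- rnorm(n)               # independent noise (assumed standard normal)

eps <- delta * eta + xi      # outcome errors under the structural model
cov_hat <- cov(eps, eta)     # approximately delta * var(eta)

eps0 <- 0 * eta + xi         # same construction with delta = 0
cov0 <- cov(eps0, eta)       # approximately zero: no selection problem

print(round(c(delta_0.5 = cov_hat, delta_0 = cov0), 3))
```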

Usage

stabit(x, m.id = "m.id", g.id = "g.id", R = "R", selection = NULL,
  outcome = NULL, roommates = FALSE, simulation = "none", seed = 123,
  max.combs = Inf, method = "NTU", binary = FALSE, offsetOut = 0,
  offsetSel = 0, marketFE = FALSE, censored = 0, gPrior = FALSE,
  dropOnes = FALSE, interOut = 0, interSel = 0, niter = 10)

Arguments

x
data frame with individual-level characteristics of all group members including market- and group-identifiers.
m.id
character string giving the name of the market identifier variable. Defaults to "m.id".
g.id
character string giving the name of the group identifier variable. Defaults to "g.id".
R
dependent variable in outcome equation. Defaults to "R".
selection
list containing the variables in the selection equation and the operators applied to them. The format is operation = "variable". See the Details and Examples sections.
outcome
list containing the variables in the outcome equation and the operators applied to them. The format is operation = "variable". See the Details and Examples sections.
roommates
logical: if TRUE data is assumed to come from a roommates game. This means that groups are of size two and the model matrix is prepared for individual-level analysis (peer-effects estimation). If FALSE (which is the default) data i
simulation
should the values of dependent variables in selection and outcome equations be simulated? Options are "none" for no simulation, "NTU" for non-transferable utility matching, "TU" for transferable utility or "ran
seed
integer setting the state for random number generation if simulation is used (that is, if simulation is not "none").
max.combs
integer (divisible by two) giving the maximum number of feasible groups to be used for generating group-level characteristics.
method
estimation method to be used. Either "NTU" or "TU" for selection correction using non-transferable or transferable utility matching as selection rule; "outcome" for estimation of the outcome equation only; or "
binary
logical: if TRUE outcome variable is taken to be binary; if FALSE outcome variable is taken to be continuous.
offsetOut
vector of integers indicating the indices of columns in X for which coefficients should be forced to 1. Use 0 for none.
offsetSel
vector of integers indicating the indices of columns in W for which coefficients should be forced to 1. Use 0 for none.
marketFE
logical: if TRUE market-level fixed effects are used in outcome equation; if FALSE no market fixed effects are used.
censored
integer indicating whether draws of the delta parameter, which estimates the covariation between the error terms of the selection and outcome equations, are censored: 0 for not censored, 1 for censored from below, 2 for censored from above.
gPrior
logical: if TRUE the g-prior (Zellner, 1986) is used for the variance-covariance matrix.
dropOnes
logical: if TRUE one-group-markets are excluded from estimation.
interOut
two-column matrix indicating the indices of columns in X that should be interacted in estimation. Use 0 for none.
interSel
two-column matrix indicating the indices of columns in W that should be interacted in estimation. Use 0 for none.
niter
number of iterations to use for the Gibbs sampler.
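
To make the interOut and interSel format concrete, here is a minimal base R sketch of a two-column index matrix and the interaction terms it would describe. It assumes that each row names one pair of columns to interact; that reading, the matrix X and all object names are assumptions for illustration, and the code does not call stabit.

```r
set.seed(42)
X <- cbind(a = rnorm(6), b = rnorm(6), c = rnorm(6))   # hypothetical regressors

# Assumed layout: each row holds the indices of one pair of columns to interact.
inter <- rbind(c(1, 2), c(1, 3))

# Build the interaction columns implied by that matrix (one column per row of inter).
X_inter <- apply(inter, 1, function(p) X[, p[1]] * X[, p[2]])
colnames(X_inter) <- apply(inter, 1, function(p) paste(colnames(X)[p], collapse = ":"))

print(round(X_inter, 2))
```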

Value

stabit returns a list with the following items.

  • model.list
  • model.frame
  • draws
  • coefs

Details

Operators for variable transformations in selection and outcome arguments.

References

Klein, T. (2014). Stable matching in microcredit: Implications for market design & econometric analysis, PhD thesis, University of Cambridge.

Zellner, A. (1986). On assessing prior distributions and Bayesian regression analysis with g-prior distributions, volume 6, pages 233--243. North-Holland, Amsterdam.

Examples

#########################################
## MODEL FRAMES (method="model.frame") ##
#########################################

## --- ROOMMATES GAME ---
## 1. Simulate one-sided matching data for 3 markets (m=3) with 3 groups
##    per market (gpm=3) and 2 individuals per group (ind=2)
 idata <- stabsim(m=3, ind=2, gpm=3)
## 2. Obtain the model frame
 s1 <- stabit(x=idata, selection = list(add="pi", ieq="wst"),
      outcome = list(add="pi", ieq="wst"),
      method="model.frame", simulation="TU", roommates=TRUE)

## --- GROUP/COALITION FORMATION (I) ---
## 1. Simulate one-sided matching data for 3 markets (m=3) with 2 groups
## per market (gpm=2) and 2 to 4 individuals per group (ind=2:4)
 idata <- stabsim(m=3, ind=2:4, gpm=2)
## 2. Obtain the model frame
 s2 <- stabit(x=idata, selection = list(add="pi", ieq="wst"),
      outcome = list(add="pi", ieq="wst"),
      method="model.frame", simulation="NTU", roommates=FALSE)

## --- GROUP/COALITION FORMATION (II) ---
## 1. Load baac00 data from the Townsend Thai project
 data(baac00)
## 2. Obtain the model frame
 s3 <- stabit(x=baac00, selection = list(add="pi", int="pi", ieq="wst", ive="occ"),
      outcome = list(add="pi", int="pi", ieq="wst", ive="occ",
      add=c("loan_size","loan_size2","lngroup_agei")),
      method="model.frame", simulation="none")

###############################
## ESTIMATION (method="NTU") ##
###############################

## --- SIMULATED EXAMPLE ---
## 1. Simulate one-sided matching data for 3 markets (m=3) with 2 groups
##    per market (gpm=2) and 2 to 4 individuals per group (ind=2:4)
 idata <- stabsim(m=3, ind=2:4, gpm=2)
## 2. Run Gibbs sampler
 fit1 <- stabit(x=idata, selection = list(add="pi",ieq="wst"),
        outcome = list(add="pi",ieq="wst"),
        method="NTU", simulation="NTU", binary=FALSE, niter=2000)
## 3. Get results
 names(fit1)

## --- REPLICATION, Klein (2014), Table 5 ---
## 1. Load data
 data(baac00)
## 2. Run Gibbs sampler
 fit2 <- stabit(x=baac00, selection = list(add="pi",int="pi",ive="occ",ieq="wst"),
        outcome = list(add="pi",int="pi",ive="occ",ieq="wst",
        add=c("loan_size","loan_size2","lngroup_agei")),
        method="NTU", binary=TRUE, gPrior=TRUE, marketFE=TRUE, niter=2000)
## 3. Get results
 names(fit2)