sigint: Estimating the parameters of the canonical discrete crisis bargaining game.

Description

This function fits the Lewis and Schultz (2003) model to data using either the pseudo-likelihood (PL) or nested pseudo-likelihood (NPL) method from Crisman-Cox and Gibilisco (2019). Throughout, we refer to the data as containing \(D\) games, where each game is observed one or more times.

Usage

sigint(
  formulas,
  data,
  subset,
  na.action,
  fixed.par = list(),
  method = c("npl", "pl"),
  npl.maxit = 25,
  npl.tol = 1e-07,
  npl.trace = FALSE,
  start.beta,
  maxlik.method = "NR",
  phat,
  phat.formulas,
  pl.vcov = FALSE,
  phat.vcov,
  seed = 12345,
  maxlik.options = list()
)

Arguments

formulas

a Formula object with four variables on the left-hand side and seven (7) separate right-hand sides. See "Details" and examples below.

data

a data frame containing the variables used to fit the model. Each row of the data frame describes an individual game \(d = 1, 2, ..., D\). Each row \(d\) should be a summary of all of the within-game observations for game \(d\). See "Details" for more information.

subset

an optional logical expression to specify a subset of observations to be used in fitting the model.

na.action

how to deal with missing data (NAs). Defaults to the na.action setting of options (typically na.omit).

fixed.par

a list with up to seven (7) named elements for normalizing payoffs to non-zero values. Names must match a payoff name as listed in "Details." Each named element should contain a single number that is the fixed (not estimated) value of that payoff. For example, to fix each side's victory-without-fighting payoff to 1, use fixed.par=list(VA=1, VB=1) and set their portions of the formulas to zero. To normalize a payoff to zero, you only need to specify it as 0 in the formulas.

method

whether to use the nested-pseudo-likelihood ("npl", default) or the pseudo-likelihood method for fitting the model. See "Details" for more information.

npl.maxit

maximum number of outer-loop iterations to be used when fitting the NPL. See "Details" for more information.

npl.tol

Convergence criterion for the NPL. When the estimates change by less than this amount, convergence is considered successful.

npl.trace

logical. Should the NPL's progress be printed to screen?

start.beta

starting values for the model coefficients as a single vector. If missing, random values are drawn from a normal distribution with mean zero and standard deviation 0.05.

maxlik.method

method used by maxLik to fit the model. Default is Newton-Raphson ("NR"). See maxLik for additional details. At this time only "NR", "BFGS", and "Nelder-Mead" are available.

phat

a list containing two vectors: PRhat and PFhat. These are the first-stage estimates that \(B\) resists a threat and that \(A\) follows through on a threat, respectively. If missing, they will be estimated by a randomForest with default options. See "Details" for more information.

phat.formulas

if phat is missing, you can supply formulas to estimate it. Should be a Formula object containing no left-hand side and 1-2 right-hand sides. If one right-hand side is given, the same covariates are used to estimate both PRhat and PFhat. Otherwise, the first RHS is used to generate PRhat, while the second RHS generates PFhat. If no formulas are provided and phat is missing, all the covariates used in the formulas argument are used here. See "Details" for more information.

pl.vcov

number of bootstrap iterations to generate phat.vcov. If less than 0 or FALSE (default), the pseudo-likelihood covariance is not estimated. Only used if method = "pl".

phat.vcov

a covariance matrix for the estimates PRhat and PFhat. If phat.vcov and phat are both missing and pl.vcov > 0, it will be estimated by bootstrapping the random forest used to fit phat.

seed

integer. Used to set the seed for the random forest and for drawing the starting values. The PL can be sensitive to starting values, so this makes results reproducible. The NPL is less sensitive, but we always recommend checking the first-order conditions.

maxlik.options

a list of options to be passed to maxLik for fitting the model.

Value

An object of class sigfit, containing:

coefficients

A vector of estimated model parameters.

vcov

Estimated variance-covariance matrix. When pl.vcov = FALSE, this slot is omitted.

utilities

Each actor's utilities at the estimated values.

fixed.par

The fixed utilities if specified in the call.

logLik

Final log-likelihood value of the model.

gradient

First derivative values at the estimated parameters.

Phat

List of two elements

PRhat The first-stage estimates of the probability that \(B\) resists (if method = "pl") or the final estimates that \(B\) resists (if method = "npl").
PFhat The first-stage estimates of the probability that \(A\) stands firm given that \(A\) challenged (if method = "pl") or the final estimates that \(A\) stands firm given that \(A\) challenged (if method = "npl").

Note that PRhat will only be an equilibrium if method = "npl" and the NPL converges.

user.phat

Logical. Did the user provide phat?

start.beta

The vector of starting values used in the PL optimization.

call

The call used to produce the object.

model

The data frame used to fit the model.

method

The method ("pl" or "npl") used to fit the model.

maxlik.method

The optimization used by maxLik to fit the model.

maxlik.code

The convergence code returned by maxLik.

maxlik.message

The convergence message returned by maxLik.

Additionally, when method = "npl", the following are also included in the sigfit object.

npl.iter: Number of best response iterations used in fitting the NPL.
npl.eval: Maximum difference between the parameters at the last two NPL iterations. If the NPL method converged, this should be less than npl.tol specified in the function call.
eq.constraint: Maximum equilibrium constraint violation.

Details

The model corresponds to an extensive-form, discrete-crisis-bargaining game from Lewis and Schultz (2003):

 
  .       A 
  .      / \ 
  .     /   \ 
  .    /     \
  .   S_A     B 
  .    0     / \
  .         /   \
  .        /     \
  .      V_A      A 
  .      C_B     / \ 
  .             /   \
  .            /     \
  .     W_A + e_A    a + e_a 
  .     W_B + e_B    V_B

If \(A\) chooses not to challenge \(B\), then the game ends at the leftmost node (\(SQ\)) and payoffs are \(S_A\) and 0 to players \(A\) and \(B\), respectively. If \(A\) challenges \(B\), \(B\) can concede or resist. If \(B\) concedes, the game ends at \(CD\) with payoffs \(V_A\) and \(C_B\). However, if \(B\) resists, \(A\) must either stand firm or back down. If \(A\) stands firm, the game ends at \(SF\) with payoffs \(W_A + \epsilon_A\) and \(W_B + \epsilon_B\). If \(A\) instead backs down in the face of \(B\)'s resistance, the game ends at the rightmost node \(BD\), with payoffs \(a + \epsilon_a\) and \(V_B\).
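As a quick illustration of how the tree maps choices to outcomes, the sketch below computes the probability of each terminal node from three choice probabilities. The names outcome_probs, pC, pR, and pF are hypothetical and introduced only for this illustration (pC is the probability that A challenges, pR that B resists after a challenge, and pF that A stands firm after resistance); this is not code from the package.

```r
# Hypothetical helper (not part of the package): terminal-node probabilities
# implied by the game tree, given the three choice probabilities.
outcome_probs <- function(pC, pR, pF) {
  c(SQ = 1 - pC,              # A does not challenge
    CD = pC * (1 - pR),       # A challenges, B concedes
    SF = pC * pR * pF,        # A challenges, B resists, A stands firm
    BD = pC * pR * (1 - pF))  # A challenges, B resists, A backs down
}

probs <- outcome_probs(pC = 0.4, pR = 0.5, pF = 0.25)
print(probs)
```

The four probabilities always sum to one, which mirrors the fact that every play of the game ends at exactly one of \(SQ\), \(CD\), \(SF\), or \(BD\).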

The seven right-hand sides specified in the formulas argument correspond to the regressors to be placed in \(S_A, V_A, C_B, W_A, W_B, a\), and \(V_B\), respectively. The model is unidentified if any regressor (including a constant term) is included in all of the formulas for a given player (Lewis and Schultz 2003). Often the easiest way to meet this requirement is to set one formula per player to 0. When an identification problem is detected, an error is issued. For example, the syntax for the formulas argument could be:

formulas = sq + cd + sf + bd ~ x1 + 0 | x2 | x2 | x1 + x2 | x1 | 1 | 0

Where:

sq + cd + sf + bd are the tallies of how many times each outcome is observed for each observation. When the game is only observed once, that observation will be a 1 and three 0s. When the game is observed multiple times, these variables should count the number of times each outcome is observed. They need to be in the order of \(SQ\), \(CD\), \(SF\), \(BD\).
\(S_A\) is a function of the variable x1 and no constant term.
\(V_A\) is a function of the variable x2 and a constant term.
\(C_B\) is a function of the variable x2 and a constant term.
\(W_A\) is a function of the variables x1, x2 and a constant term.
\(W_B\) is a function of the variable x1 and a constant term.
\(a\) is a constant term.
\(V_B\) is fixed to 0 (or to a non-zero value set by fixed.par).
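The identification condition can be checked by hand before fitting. The sketch below is a minimal illustration using only base R; shared_regressors is a hypothetical helper, not the check the package performs internally, and the term lists are typed in by hand from the example formula above, with "1" standing for a constant term.

```r
# Hypothetical helper (not part of the package): returns the regressors that
# appear in every one of a player's utility specifications.
shared_regressors <- function(terms_by_utility) {
  Reduce(intersect, terms_by_utility)
}

# Terms implied by the example formula; "1" marks a constant term.
player_A <- list(SA = c("x1"),
                 VA = c("1", "x2"),
                 WA = c("1", "x1", "x2"),
                 a  = c("1"))
player_B <- list(CB = c("1", "x2"),
                 WB = c("1", "x1"),
                 VB = character(0))

ok_A <- shared_regressors(player_A)  # empty: no term is in all four
ok_B <- shared_regressors(player_B)  # empty: V_B has no regressors

# A problematic specification: a constant in all of A's utilities.
bad_A <- list(SA = c("1", "x1"), VA = c("1", "x2"),
              WA = c("1", "x1", "x2"), a = c("1"))
problem <- shared_regressors(bad_A)  # the constant is shared by all four
```

An empty result for both players suggests the specification meets the requirement; a non-empty result names the terms that would need to be dropped from at least one of that player's formulas.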

Each row of the data frame should be a summary of the covariates and outcomes associated with that particular game. When each game is observed only once, then this will resemble an ordinary dyad-time data frame. However, if there are multiple observations per game, then each row should be a summary of all the data associated with that game. For example, if there are \(D\) games in the data, where each is observed \(T_d\) times, then the data frame should have \(D\) rows. The four columns making up the dependent variable will denote the frequencies of each outcome for game \(d\), such that sq\(_d\) + cd\(_d\) + sf\(_d\) + bd\(_d = T_d\). The covariates in row \(d\) should be summary statistics for the exogenous variables (e.g., mean, median, mode, first observation).
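When the raw data have one row per observation rather than one row per game, they need to be collapsed first. The following is a minimal base R sketch of that step; the data frame obs and its columns game, outcome, and x1 are hypothetical, and the within-game mean is just one of the summary statistics mentioned above.

```r
# Hypothetical observation-level data: one row per time a game is observed.
obs <- data.frame(
  game    = c(1, 1, 1, 2, 2, 3),
  outcome = c("sq", "sq", "cd", "sf", "bd", "sq"),
  x1      = c(0.2, 0.4, 0.6, 1.0, 3.0, 5.0)
)

# Count outcomes per game, fixing the column order to SQ, CD, SF, BD.
obs$outcome <- factor(obs$outcome, levels = c("sq", "cd", "sf", "bd"))
tallies <- as.data.frame.matrix(table(obs$game, obs$outcome))

# Summarize the covariate within each game (the mean is one option).
x1_mean <- tapply(obs$x1, obs$game, mean)

# One row per game: identifier, the four outcome tallies, and the covariate.
games <- data.frame(game = as.numeric(rownames(tallies)),
                    tallies,
                    x1 = as.numeric(x1_mean[rownames(tallies)]))
print(games)
```

Each row of games now sums to \(T_d\) across the four outcome columns, which is the layout the formulas argument expects on its left-hand side.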

The model is first fit using a pseudo-likelihood estimator. This approach requires first-stage estimates of the probability that \(B\) resists and the probability that \(A\) fights conditional on \(B\) choosing to resist. These first-stage estimates should be flexible, and we recommend that users fit a semi-parametric or non-parametric model to produce them. If the analyst produces these estimates before calling this function, they can be supplied as a list to the phat argument. This list should contain two named elements:

PRhat is the probability that \(B\) resists. This should be a vector of probabilities with one estimated probability for each observation.
PFhat is the probability that \(A\) stands firm conditional on \(B\) resisting. This should be a vector of probabilities with one estimated probability for each observation.
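The sketch below shows one way such a list might be assembled. It is only an illustration under loud assumptions: the data are simulated, the data frame games and covariate x1 are hypothetical, and a plain logit fit with glm is used purely as a simple stand-in for the flexible semi-parametric or non-parametric first stage recommended above.

```r
set.seed(1)

# Simulated game-level data (hypothetical): D games, each observed 5 times.
D  <- 200
x1 <- rnorm(D)
pC <- 0.7                  # assumed probability that A challenges
pR <- plogis(0.5 * x1)     # assumed probability that B resists
pF <- plogis(-0.3 * x1)    # assumed probability that A stands firm
probs <- cbind(sq = 1 - pC, cd = pC * (1 - pR),
               sf = pC * pR * pF, bd = pC * pR * (1 - pF))
tallies <- t(apply(probs, 1, function(p) rmultinom(1, size = 5, prob = p)))
colnames(tallies) <- c("sq", "cd", "sf", "bd")
games <- data.frame(tallies, x1 = x1)

# Stand-in first stage: B resists (sf + bd) versus concedes (cd),
# and A stands firm (sf) versus backs down (bd).
fit_R <- glm(cbind(sf + bd, cd) ~ x1, family = binomial, data = games)
fit_F <- glm(cbind(sf, bd) ~ x1, family = binomial, data = games)

# One predicted probability per row of the data, as phat requires.
Phat <- list(
  PRhat = as.numeric(predict(fit_R, newdata = games, type = "response")),
  PFhat = as.numeric(predict(fit_F, newdata = games, type = "response"))
)
str(Phat)
```

Whatever first-stage model is used, the key requirement is the shape of the result: two named vectors, each with one probability per row of the data.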

If the user leaves the phat argument empty, then these first-stage estimates are produced internally using the randomForest function. Users who want to use the random forest can supply a formula for it through the phat.formulas argument. This argument can take a formula with nothing on the left-hand side and 1-2 right-hand sides. If two right-hand sides are provided, then the first is used to generate PRhat and the second is used for PFhat. If only one right-hand side is provided, it is used for both. Some examples:

phat.formulas = ~ x1 + x2 predict PRhat and PFhat using x1 and x2.
phat.formulas = ~ x1 + x2 | x1 + x2 predict PRhat and PFhat using x1 and x2.
phat.formulas = ~ x1 + x2 | x1 predict PRhat using x1 and x2, but predict PFhat using only x1.

If both phat and phat.formulas are missing, then a random forest is fit using all the exogenous variables listed in the formulas argument.

If method = "npl", then estimation continues. For each iteration of the NPL, the estimates of PRhat and PFhat are updated by one best-response iteration using the current parameter estimates. The model is then refit using these updated choice probabilities. This process continues until the maximum absolute change in parameters and choice probabilities is less than npl.tol (default, 1e-7), or the number of outer iterations exceeds npl.maxit (default, 25). In the latter case, a warning is produced.
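The stopping rule of this outer loop can be sketched with a toy fixed-point problem. Everything in the block below is hypothetical: fit_step and best_response are simple stand-in maps chosen so that the iteration converges, not the likelihood maximization or best-response functions used by the package. Only the loop structure (update, refit, compare against the tolerance, warn at the iteration cap) follows the description above.

```r
# Toy stand-ins (hypothetical, NOT the package internals):
# fit_step() plays the role of "refit the model given choice probabilities";
# best_response() plays the role of "one best-response update".
fit_step      <- function(p) qlogis(p)
best_response <- function(theta) plogis(0.5 * theta)

npl_loop <- function(p0, tol = 1e-7, maxit = 25) {
  p <- p0
  theta <- fit_step(p)
  for (iter in seq_len(maxit)) {
    p_new     <- best_response(theta)  # update the choice probabilities
    theta_new <- fit_step(p_new)       # refit with the updated probabilities
    # Maximum absolute change in parameters and choice probabilities.
    change <- max(abs(c(theta_new - theta, p_new - p)))
    p <- p_new
    theta <- theta_new
    if (change < tol) {
      return(list(theta = theta, p = p, iter = iter,
                  eval = change, converged = TRUE))
    }
  }
  warning("maximum number of outer iterations reached")
  list(theta = theta, p = p, iter = maxit, eval = change, converged = FALSE)
}

out <- npl_loop(p0 = 0.9, maxit = 100)
```

In this toy problem the fixed point is p = 0.5, and the returned eval plays the same role as npl.eval in the fitted object: it should fall below the tolerance when the loop has converged.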

If pseudo-likelihood (method="pl") is used, then pl.vcov is checked. There are four possibilities here:

pl.vcov = FALSE (default), then no covariance matrix or standard errors are returned, only the point estimates.
pl.vcov > 0 and phat.vcov is supplied, then phat.vcov is used to estimate the PL's covariance matrix.
pl.vcov > 0, phat.vcov is missing, and phat is missing, then the random forest used to estimate PRhat and PFhat is bootstrapped (simple, nonparametric bootstrap) pl.vcov times.
pl.vcov > 0, phat.vcov is missing, and phat is not missing, then an error is returned.
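The four cases can be summarized as a small decision function. The sketch below is a hypothetical helper written only to mirror the list above; pl_vcov_case and its logical arguments has.phat.vcov and has.phat are not part of the package.

```r
# Hypothetical helper (not part of the package) mirroring the four cases.
pl_vcov_case <- function(pl.vcov = FALSE,
                         has.phat.vcov = FALSE,
                         has.phat = FALSE) {
  # Case 1: no covariance requested, so only point estimates are returned.
  if (isFALSE(pl.vcov) || pl.vcov <= 0) return("point estimates only")
  # Case 2: a first-stage covariance matrix was supplied by the user.
  if (has.phat.vcov) return("use supplied phat.vcov")
  # Case 3: nothing supplied, so the internal random forest is bootstrapped.
  if (!has.phat) return("bootstrap the random forest")
  # Case 4: phat supplied without phat.vcov, which cannot be bootstrapped.
  stop("phat was supplied without phat.vcov")
}

case_default   <- pl_vcov_case()
case_bootstrap <- pl_vcov_case(pl.vcov = 25)
case_supplied  <- pl_vcov_case(pl.vcov = TRUE, has.phat.vcov = TRUE,
                               has.phat = TRUE)
```

The last case is an error because the function has no way to recover the sampling variability of first-stage estimates that were produced outside of it.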

References

Casey Crisman-Cox and Michael Gibilisco. 2019. "Estimating Signaling Games in International Relations: Problems and Solutions." Political Science Research and Methods. Online First.

Jeffrey B. Lewis and Kenneth A. Schultz. 2003. "Revealing Preferences: Empirical Estimation of a Crisis Bargaining Game with Incomplete Information." Political Analysis 11:345--367.

Examples

library(sigInt)
data("sanctionsData")
f1 <- sq+cd+sf+bd ~ sqrt(senderecondep) + senderdemocracy + contig + ally -1|#SA
                    anticipatedsendercosts|#VA
                    sqrt(targetecondep) + anticipatedtargetcosts + contig + ally|#CB
                    sqrt(senderecondep) + senderdemocracy + lncaprat | #barWA
                    targetdemocracy + lncaprat| #barWB
                    senderdemocracy| #bara
                    -1#VB

## Using Nested-Pseudo Likelihood with default first stage
fit1 <- sigint(f1, data=sanctionsData, npl.trace=TRUE)
summary(fit1)

## Using Pseudo Likelihood with user supplied first stage
Phat <- list(PRhat=sanctionsData$PRnpl, PFhat=sanctionsData$PFnpl)
fit2 <- sigint(f1, data=sanctionsData, method="pl", phat=Phat)
summary(fit2)

## Using Pseudo Likelihood with user made first stage and user covariance
## SIGMA is a bootstrapped first-stage covariance matrix (not provided)
fit3 <- sigint(f1, data=sanctionsData, method="pl", phat=Phat, phat.vcov=SIGMA, pl.vcov=TRUE)
summary(fit3)

## Using Pseudo Likelihood with default first stage and
## bootstrapped standard errors for the first stage covariance
fit4 <- sigint(f1, data=sanctionsData, method="pl", pl.vcov=25)
summary(fit4)
