This function fits the Lewis and Schultz (2003) model to data using either the pseudo-likelihood (PL) or nested pseudo-likelihood (NPL) method from Crisman-Cox and Gibilisco (2018). Throughout, we refer to the data as containing \(D\) games, where each game is observed one or more times.
sigint(
formulas,
data,
subset,
na.action,
fixed.par = list(),
method = c("npl", "pl"),
npl.maxit = 25,
npl.tol = 1e-07,
npl.trace = FALSE,
start.beta,
maxlik.method = "NR",
phat,
phat.formulas,
pl.vcov = FALSE,
phat.vcov,
seed = 12345,
maxlik.options = list()
)
a Formula object with four variables on the left-hand side and seven (7) separate right-hand sides. See "Details" and examples below.
a data frame containing the variables used to fit the model. Each row of the data frame describes an individual game \(d = 1, 2, ..., D\). Each row \(d\) should be a summary of all of the within-game observations for game \(d\). See "Details" for more information.
an optional logical expression to specify a subset of observations to be used in fitting the model.
how to deal with missing data (NAs). Defaults to the na.action setting of options (typically na.omit).
a list with up to seven (7) named elements for normalizing payoffs to non-zero values.
Names must match a payoff name as listed in "Details."
Each named element should contain a single number that is the fixed (not estimated) value of that payoff.
For example, to fix each side's victory-without-fighting payoff to 1, use fixed.par=list(VA=1, VB=1) and set their portions of the formulas to zero.
To normalize a payoff to zero, you only need to specify it as 0 in the formulas.
whether to use the nested pseudo-likelihood ("npl", default) or the pseudo-likelihood ("pl") method for fitting the model. See "Details" for more information.
maximum number of outer-loop iterations to be used when fitting the NPL. See "Details" for more information.
convergence criterion for the NPL. When the estimates change by less than this amount, convergence is considered successful.
logical. Should the NPL's progress be printed to screen?
starting values for the model coefficients as a single vector. If missing, random values are drawn from a normal distribution with mean zero and standard deviation 0.05.
a list containing two vectors: PRhat and PFhat.
These are the first-stage estimates of the probabilities that \(B\) resists a threat and that \(A\) follows through on a threat, respectively.
If missing, they will be estimated by a randomForest with default options.
See "Details" for more information.
if phat is missing, you can supply formulas to estimate it. Should be a Formula object containing no left-hand side and 1-2 right-hand sides. If one right-hand side is given, the same covariates are used to estimate both PRhat and PFhat.
Otherwise, the first RHS is used to generate PRhat, while the second RHS generates PFhat.
If no formulas are provided and phat is missing, all the covariates used in the formulas argument are used here. See "Details" for more information.
number of bootstrap iterations used to generate phat.vcov.
If less than or equal to 0 or FALSE (default), the pseudo-likelihood covariance is not estimated. Only used if method = "pl".
a covariance matrix for the estimates PRhat and PFhat.
If missing, pl.vcov > 0, and phat is missing, it will be estimated by bootstrapping the random forest used to fit phat.
integer. Used to set the seed for the random forest and for drawing the starting values. The PL can be sensitive to starting values, so this makes results reproducible. The NPL is less sensitive, but we always recommend checking the first-order conditions.
a list of options to be passed to
maxLik
for fitting the model.
An object of class sigfit
, containing:
coefficients
A vector of estimated model parameters.
vcov
Estimated variance-covariance matrix. When pl.vcov = FALSE
, this slot is omitted.
utilities
Each actor's utilities at the estimated values.
fixed.par
The fixed utilities if specified in the call.
logLik
Final log-likelihood value of the model.
gradient
First derivative values at the estimated parameters.
Phat
List of two elements
PRhat
The first stage estimates of the probability that
\(B\) resists (method = "pl"
) or the final estimates that
\(B\) resists (if method = "npl"
)
PFhat
The first stage estimates of the probability that
\(A\) stands firm given that \(A\) challenged (method = "pl"
) or
the final estimates that \(A\) stands firm given that \(A\) challenged
(if method = "npl"
)
PRhat will only be an equilibrium if method = "npl" and the NPL converges.
user.phat
Logical. Did the user provide phat?
start.beta
The vector of starting values used in the PL optimization.
call
The call used to produce the object.
model
The data frame used to fit the model.
method
The method ("pl"
or "npl"
) used to fit the model.
maxlik.method
The optimization used by maxLik
to fit the model.
maxlik.code
The convergence code returned by maxLik
.
maxlik.message
The convergence message returned by maxLik
.
Additionally, when method = "npl"
, the following are also included in the sigfit
object.
npl.iter
Number of best response iterations used in fitting the NPL.
npl.eval
Maximum difference between the parameters at the last two NPL iterations. If the NPL method converged, this should be less than npl.tol
specified in the function call.
eq.constraint
Maximum equilibrium constraint violation.
The model corresponds to an extensive-form, discrete-crisis-bargaining game from Lewis and Schultz (2003):
.        A
.       / \
.      /   \
.     /     \
.   S_A      B
.    0      / \
.          /   \
.         /     \
.       V_A      A
.       C_B     / \
.              /   \
.             /     \
.     W_A + e_A    a + e_a
.     W_B + e_B    V_B
If \(A\) chooses not to challenge \(B\), then the game ends at the leftmost node (\(SQ\)) and payoffs are \(S_A\) and 0 to players \(A\) and \(B\), respectively. If \(A\) challenges \(B\), \(B\) can concede or resist. If \(B\) concedes, the game ends at \(CD\) with payoffs \(V_A\) and \(C_B\). If \(B\) resists, \(A\) must decide whether to stand firm or back down. Standing firm ends the game at \(SF\) with payoffs \(W_A + \epsilon_A\) and \(W_B + \epsilon_B\). Backing down in the face of \(B\)'s resistance ends the game at the rightmost node \(BD\), with payoffs \(a + \epsilon_a\) and \(V_B\).
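The game tree implies a simple mapping from choice probabilities to outcome probabilities. The sketch below is illustrative only: the function name `outcome_probs` and the argument `pC` (the probability that \(A\) challenges) are hypothetical names introduced here, while `pR` and `pF` play the roles of PRhat and PFhat.

```r
# Outcome probabilities implied by the game tree (illustrative sketch).
# pC: Pr(A challenges); pR: Pr(B resists | challenge);
# pF: Pr(A stands firm | B resists). All names are hypothetical.
outcome_probs <- function(pC, pR, pF) {
  cbind(
    sq = 1 - pC,             # A does not challenge
    cd = pC * (1 - pR),      # A challenges, B concedes
    sf = pC * pR * pF,       # B resists, A stands firm
    bd = pC * pR * (1 - pF)  # B resists, A backs down
  )
}

p <- outcome_probs(pC = c(0.5, 0.9), pR = c(0.4, 0.2), pF = c(0.7, 0.5))
rowSums(p)  # each row sums to one
```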
The seven right-hand formulas that are specified in the formulas argument correspond to the regressors to be placed in \(S_A, V_A, C_B, W_A, W_B, a\), and \(V_B\), respectively. The model is unidentified if any regressor (including a constant term) is included in all the formulas for each player (Lewis and Schultz 2003). Often the easiest way to meet this requirement is to set one formula per player to 0. When an identification problem is detected, an error is issued. For example, the syntax for the formulas argument could be:
formulas = sq + cd + sf + bd ~ x1 + 0 | x2 | x2 | x1 + x2 | x1 | 1 | 0
Where:
sq + cd + sf + bd
are the tallies of how many
times each outcome is observed for each observation. When the game is only
observed once, that observation will be a 1 and three 0s. When the game is
observed multiple times, these variables should count the number of times
each outcome is observed. They need to be in the order of \(SQ\),
\(CD\), \(SF\), \(BD\).
\(S_A\) is a function of the variable x1
and no constant term.
\(V_A\) is a function of the variable x2
and a constant term.
\(C_B\) is a function of the variable x2
and a constant term.
\(W_A\) is a function of the variables x1
, x2
and a constant term.
\(W_B\) is a function of the variable x1
and a constant term.
\(a\) is a constant term.
\(V_B\) is fixed to 0 (or a non-zero value set by fixed.par).
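The identification requirement can be checked by hand before fitting. The sketch below is an assumption-laden illustration, not the package's internal check: it reads the requirement as "no regressor may appear in every utility belonging to the same player", writes out the regressors of the example formula manually (with "1" marking a constant), and uses the hypothetical helper `shared_regressors`.

```r
# Regressors in each utility, copied by hand from the example formula.
# "1" marks a constant term. Illustrative only; names are hypothetical.
regs <- list(
  SA = c("x1"),
  VA = c("1", "x2"),
  CB = c("1", "x2"),
  WA = c("1", "x1", "x2"),
  WB = c("1", "x1"),
  a  = c("1"),
  VB = character(0)
)

# Utilities grouped by the player who receives them.
players <- list(A = c("SA", "VA", "WA", "a"), B = c("CB", "WB", "VB"))

# Regressors that appear in every utility of a given player.
shared_regressors <- function(regs, players) {
  lapply(players, function(u) Reduce(intersect, regs[u]))
}

shared <- shared_regressors(regs, players)
identified <- all(lengths(shared) == 0)
identified
```

Adding a constant to \(S_A\) in this sketch would leave "1" shared across all of \(A\)'s utilities, which is the kind of specification the text warns against.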
Each row of the data frame should be a summary of the covariates and outcomes associated with that particular game.
When each game is observed only once, then this will resemble an ordinary dyad-time data frame.
However, if there are multiple observations per game, then each row should be a summary of all the data associated
with that game.
For example, if there are \(D\) games in the data, where each is observed \(T_d\) times, then the data frame
should have \(D\) rows.
The four columns making up the dependent variable will denote the frequencies of each outcome for game \(d\),
such that sq
\(_d\) + cd
\(_d\) + sf
\(_d\) + bd
\(_d = T_d\).
The covariates in row \(d\) should be summary statistics for the exogenous variables (e.g., mean, median, mode, first observation).
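As a rough sketch of this layout, the base R code below collapses a hypothetical long-format data set (one row per observed play) into one row per game. The data frame `long`, its columns, and the choice of the mean as the covariate summary are all assumptions made for illustration.

```r
# Hypothetical long-format data: one row per observed play of a game.
long <- data.frame(
  game    = c(1, 1, 1, 2, 2, 3),
  outcome = c("sq", "sq", "cd", "sf", "bd", "sq"),
  x1      = c(0.2, 0.4, 0.6, 1.0, 2.0, -1.0)
)

# Fix the outcome order so the tally columns come out as SQ, CD, SF, BD.
outcomes <- c("sq", "cd", "sf", "bd")
long$outcome <- factor(long$outcome, levels = outcomes)

# Outcome frequencies per game (D rows, four columns).
tallies <- as.data.frame.matrix(table(long$game, long$outcome))

# Summarize the covariate within each game (here: the mean).
x1_mean <- tapply(long$x1, long$game, mean)

games <- data.frame(
  game = as.numeric(rownames(tallies)),
  tallies,
  x1 = as.numeric(x1_mean[rownames(tallies)])
)
rownames(games) <- NULL
games
```

Each row's tallies sum to the number of times that game was observed, which is the \(T_d\) described above.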
The model is first fit using a pseudo-likelihood estimator. This approach
requires first-stage estimates of the probability that \(B\) resists and
the probability that \(A\) fights conditional on \(B\) choosing to
resist. We recommend that users fit a flexible semi-parametric or
non-parametric model to produce these estimates. If the analyst produces
them before calling this function, they can be supplied as a list to the
phat
argument. This list should contain two named elements:
PRhat
is the probability that \(B\) resists. This should be
a vector of probabilities with one estimated probability for each
observation.
PFhat
is the probability that \(A\) stands firm
conditional on \(B\) resisting. This should be a vector of probabilities
with one estimated probability for each observation.
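A minimal sketch of building such a list by hand follows. It uses simulated data and a logistic regression from base R purely to stay dependency-free; the text above recommends a more flexible first stage, so treat `glm` here as a stand-in and the data frame `games` as hypothetical.

```r
set.seed(1)

# Simulated data: 200 games, each observed once (hypothetical).
D <- 200
games <- data.frame(x1 = rnorm(D), x2 = rnorm(D))
out <- sample(c("sq", "cd", "sf", "bd"), D, replace = TRUE)
for (o in c("sq", "cd", "sf", "bd")) games[[o]] <- as.integer(out == o)

# Pr(B resists | A challenges): resist = sf + bd, concede = cd.
fit_R <- glm(cbind(sf + bd, cd) ~ x1 + x2, family = binomial, data = games)

# Pr(A stands firm | B resists): stand firm = sf, back down = bd.
fit_F <- glm(cbind(sf, bd) ~ x1 + x2, family = binomial, data = games)

# One predicted probability per row of the data, for both quantities.
Phat <- list(
  PRhat = as.numeric(predict(fit_R, newdata = games, type = "response")),
  PFhat = as.numeric(predict(fit_F, newdata = games, type = "response"))
)
str(Phat)
```

Predicting on the full data frame gives every game a value, including games where \(A\) never challenged or \(B\) never resisted.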
If the user leaves the phat
argument empty, then these first-stage
estimates are produced internally using the
randomForest
function.
Users wanting to use the
random forest can supply a formula for it using the argument
phat.formulas
.
This argument can take a formula with nothing on the
left-hand side and 1-2 right-hand sides.
If two right-hand sides are
provided then the first is used to generate PRhat
, and the second is
used for PFhat
.
If only one right-hand side is provided, it is used
for both. Some examples:
phat.formulas = ~ x1 + x2
predict PRhat
and PFhat
using x1
and x2
.
phat.formulas = ~ x1 + x2 | x1 + x2
predict PRhat
and
PFhat
using x1
and x2
.
phat.formulas = ~ x1 + x2 | x1
predict PRhat
using x1
and x2
, but
predict PFhat
using only x1
.
If both phat
and phat.formulas
are missing, then a random forest is fit using all the
exogenous variables listed in the formulas argument.
If method = "npl"
, then estimation continues.
For each iteration of the NPL, the estimates of PRhat
and PFhat
are updated
by one best-response iteration using the current parameter estimates.
The model is then refit using these updated choice probabilities.
This process continues until the maximum absolute change in
parameters and choice probabilities is less than npl.tol
(default, 1e-7
), or
the number of outer iterations exceeds npl.maxit
(default, 25
).
In the latter case, a warning is produced.
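The stopping rule can be illustrated with a toy fixed-point iteration. This is not the NPL update itself: `toy_update` is a made-up contraction standing in for "refit the model, then take one best-response step", `theta` stands in for the stacked parameters and choice probabilities, and `npl_loop_sketch` is a hypothetical name.

```r
# Toy outer loop showing only the stopping rule (illustrative sketch).
npl_loop_sketch <- function(update, theta0, tol = 1e-7, maxit = 25) {
  theta <- theta0
  change <- Inf
  for (iter in seq_len(maxit)) {
    theta_new <- update(theta)
    change <- max(abs(theta_new - theta))  # maximum absolute change
    theta <- theta_new
    if (change < tol) {
      return(list(theta = theta, iter = iter, eval = change, converged = TRUE))
    }
  }
  warning("maximum number of outer iterations reached without convergence")
  list(theta = theta, iter = maxit, eval = change, converged = FALSE)
}

# Made-up contraction with fixed point 2; a stand-in for the real update.
toy_update <- function(theta) 0.1 * theta + 1.8

res <- npl_loop_sketch(toy_update, theta0 = c(0, 10))
res$converged
res$eval < 1e-7
```

In the fitted object, npl.iter and npl.eval report the analogous iteration count and final maximum change.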
If pseudo-likelihood (method="pl"
) is used, then
pl.vcov
is checked.
There are four possibilities here:
pl.vcov = FALSE
(default), then no covariance matrix or
standard errors are returned, only the point estimates.
pl.vcov > 0
and phat.vcov
is supplied,
then phat.vcov
is used to estimate the PL's covariance matrix.
pl.vcov > 0
, phat.vcov
is missing, and phat
is missing, then the random forest used to estimate PRhat
and
PFhat
is bootstrapped (simple, nonparametric bootstrap) pl.vcov
times.
pl.vcov > 0
, phat.vcov
is missing, and phat
is not
missing, then an error is returned.
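The four cases can be summarized as a small decision function. This is only a restatement of the list above in code; `pl_vcov_case` and its logical arguments are hypothetical and do not reflect how the package is implemented.

```r
# Decision logic for the four pl.vcov cases (illustrative sketch).
# has_phat / has_phat_vcov: was phat / phat.vcov supplied by the user?
pl_vcov_case <- function(pl.vcov = FALSE, has_phat = FALSE,
                         has_phat_vcov = FALSE) {
  if (!(pl.vcov > 0)) {
    return("no covariance matrix; point estimates only")
  }
  if (has_phat_vcov) {
    return("use the supplied phat.vcov")
  }
  if (!has_phat) {
    return(sprintf("bootstrap the random forest %d times", as.integer(pl.vcov)))
  }
  stop("phat was supplied without phat.vcov")
}

pl_vcov_case()                                    # default: no covariance
pl_vcov_case(pl.vcov = 25)                        # bootstrap the first stage
pl_vcov_case(pl.vcov = TRUE, has_phat = TRUE,
             has_phat_vcov = TRUE)                # user-supplied covariance
```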
Casey Crisman-Cox and Michael Gibilisco. 2019. "Estimating Signaling Games in International Relations: Problems and Solutions." Political Science Research and Methods. Online First.
Jeffrey B. Lewis and Kenneth A. Schultz. 2003. "Revealing Preferences: Empirical Estimation of a Crisis Bargaining Game with Incomplete Information." Political Analysis 11:345--367.
# NOT RUN {
data("sanctionsData")
f1 <- sq+cd+sf+bd ~ sqrt(senderecondep) + senderdemocracy + contig + ally -1|#SA
anticipatedsendercosts|#VA
sqrt(targetecondep) + anticipatedtargetcosts + contig + ally|#CB
sqrt(senderecondep) + senderdemocracy + lncaprat | #barWA
targetdemocracy + lncaprat| #barWB
senderdemocracy| #bara
-1#VB
## Using Nested-Pseudo Likelihood with default first stage
# }
# NOT RUN {
fit1 <- sigint(f1, data=sanctionsData, npl.trace=TRUE)
summary(fit1)
# }
# NOT RUN {
## Using Pseudo Likelihood with user supplied first stage
Phat <- list(PRhat=sanctionsData$PRnpl, PFhat=sanctionsData$PFnpl)
fit2 <- sigint(f1, data=sanctionsData, method="pl", phat=Phat)
summary(fit2)
## Using Pseudo Likelihood with user made first stage and user covariance
## SIGMA is a bootstrapped first-stage covariance matrix (not provided)
# }
# NOT RUN {
fit3 <- sigint(f1, data=sanctionsData, method="pl", phat=Phat, phat.vcov=SIGMA, pl.vcov=TRUE)
summary(fit3)
# }
# NOT RUN {
## Using Pseudo Likelihood with default first stage and
## bootstrapped standard errors for the first stage covariance
# }
# NOT RUN {
fit4 <- sigint(f1, data=sanctionsData, method="pl", pl.vcov=25)
summary(fit4)
# }