The model corresponds to the following extensive-form
game, described in Signorino (2003):
.       1
.      /\
.     /  \
.    /    \ 2
.   u11   /\
.        /  \
.       /    \
.      u13   u14
.      0     u24

If Player 1 chooses L, the game ends and Player 1
receives a payoff of u11. (Player 2's utility in this
case cannot be identified in a statistical model.) If
Player 1 chooses R, then Player 2 can choose L, resulting
in payoffs of u13 for Player 1 and 0 for Player 2, or R,
with payoffs of u14 for Player 1 and u24 for Player 2.
The four equations specified in the function's formulas
argument correspond to the regressors to be placed in
u11, u13, u14, and u24 respectively. If there is any
regressor (including the constant) placed in all of u11,
u13, and u14, egame12 will stop and issue an error
message, because the model is then
unidentified (see Lewis and Schultz 2003). There are two
equivalent ways to express the formulas passed to this
argument. One is to use a list of four formulas, where
the first contains the response variable(s) (discussed
below) on the left-hand side and the other three are
one-sided. For instance, suppose:
- u11 is a function of x1, x2, and a constant
- u13 is set to 0
- u14 is a function of x3 and a constant
- u24 is a function of z and a constant.
The list notation would be
formulas = list(y ~ x1 + x2, ~ 0, ~ x3, ~ z). The other
method is to use the Formula syntax, with one left-hand
side and four right-hand sides (separated by vertical
bars). This notation would be
formulas = y ~ x1 + x2 | 0 | x3 | z. To fix a utility at
0, just use 0 as its equation, as in the example just
given. To estimate only a constant for a particular
utility, use 1 as its equation.
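The identification rule above can be sketched in base R. The helper shared_regressors below is hypothetical and is not part of the package; it only illustrates the condition described above (a regressor, or the constant, appearing in all of u11, u13, and u14) and is not necessarily how egame12 performs its own check.

```r
# Hypothetical helper (not part of the package): returns the regressors,
# including the constant, that appear in all of u11, u13, and u14
# (the first three formulas of a list-notation specification).
shared_regressors <- function(formulas) {
  rhs_terms <- lapply(formulas[1:3], function(f) {
    tt <- terms(f)
    labels <- attr(tt, "term.labels")
    if (attr(tt, "intercept") == 1) labels <- c("(Intercept)", labels)
    labels
  })
  Reduce(intersect, rhs_terms)
}

# The specification from the text: u13 is fixed at 0, so nothing is shared.
ok <- list(y ~ x1 + x2, ~ 0, ~ x3, ~ z)

# An illustrative specification that violates the rule: x1 and the
# constant appear in all three of Player 1's utilities.
bad <- list(y ~ x1 + x2, ~ x1, ~ x1 + x3, ~ z)

shared_regressors(ok)
shared_regressors(bad)
```

An empty result means no regressor is common to all three of Player 1's utility equations, so the specification passes this particular condition.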
There are three equivalent ways to specify the outcome in
formulas. One is to use a numeric vector with three
unique values, with their values (from lowest to highest)
corresponding to the terminal nodes of the game tree
illustrated above (from left to right). The second is to
use a factor, with the levels (in order as given by
levels(y)) corresponding to the terminal nodes. The final
way is to use two indicator variables, with the first
standing for whether Player 1 moves L (0) or R (1), and
the second standing for Player 2's choice if Player 1
moves R. (The values of the second when Player 1 moves L
should be set to 0 or 1, not NA, in order to ensure that
observations are not dropped from the data when
na.action = na.omit.) The way to specify formulas when
using indicator variables is, for example,
y1 + y2 ~ x1 + x2 | 0 | x3 | z.
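As a minimal sketch of the three outcome codings, the base R code below builds each one from the same made-up sequence of moves. The variable names (move1, move2, y_num, y_fac, y1, y2) and the data are assumptions for illustration only.

```r
# Made-up moves for five observations. Player 2 moves only when
# Player 1 chooses R, so move2 is NA whenever move1 is "L".
move1 <- c("L", "R", "R", "L", "R")
move2 <- c(NA,  "L", "R", NA,  "R")

# 1. Numeric vector: values ordered like the terminal nodes,
#    left to right: 1 = L, 2 = (R, L), 3 = (R, R).
y_num <- ifelse(move1 == "L", 1, ifelse(move2 == "L", 2, 3))

# 2. Factor: the level order matches the terminal nodes.
y_fac <- factor(y_num, levels = 1:3, labels = c("L", "RL", "RR"))

# 3. Two indicators: y1 is Player 1's move (0 = L, 1 = R); y2 is
#    Player 2's move, coded 0 (not NA) when Player 1 moves L.
y1 <- as.integer(move1 == "R")
y2 <- as.integer(move1 == "R" & move2 == "R")

data.frame(y_num, y_fac, y1, y2)
```

Coding y2 through the logical expression gives 0 rather than NA when Player 1 moves L, which keeps those rows from being dropped under na.action = na.omit.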
If fixedUtils or sdformula is specified, the estimated
parameters will include terms labeled log(sigma) (for
probit links) or log(lambda) (for logit links). These are
the scale parameters of the stochastic components of the
players' utility. If sdByPlayer is FALSE, then the
variance of error terms (or the equation describing it,
if sdformula contains non-constant regressors) is assumed
to be common across all players. If sdByPlayer is TRUE,
then two variances (or equations) are estimated: one for
each player. For more on the interpretation of the scale
parameters in these models and how it differs between the
agent error and private information models, see Signorino
(2003).
The model is fit using maxLik, with the BFGS optimization
method by default (see maxBFGS). Use the method argument
to specify an alternative from among those supplied by
maxLik.
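To illustrate what switching the optimization method means, the sketch below calls maxLik directly on a toy normal log-likelihood rather than on the game model itself. The data and the loglik function are made up for illustration, and the example assumes the maxLik package is installed; "BFGS" and "NM" (Nelder-Mead) are two of the methods that maxLik supplies.

```r
library(maxLik)

# Made-up data: a normal sample with unknown mean and standard deviation.
set.seed(1)
x <- rnorm(200, mean = 2, sd = 1.5)

# Log-likelihood with theta = (mean, log(sd)). The scale enters on the
# log scale so that the optimizer can search over the whole real line.
loglik <- function(theta) {
  sum(dnorm(x, mean = theta[1], sd = exp(theta[2]), log = TRUE))
}

# Same likelihood and starting values, two different optimizers.
fit_bfgs <- maxLik(loglik, start = c(0, 0), method = "BFGS")
fit_nm   <- maxLik(loglik, start = c(0, 0), method = "NM")

coef(fit_bfgs)
coef(fit_nm)
```

Both fits should land near the sample mean and the log of the sample standard deviation; comparing results across methods is one informal way to check that an optimizer has converged.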