The typical state space for softball involves 25 states
defined by the base situation (runners on base) and number of outs. The
standard base situations are: (1) bases empty, (2) runner on first, (3) runner
on second, (4) runner on third, (5) runners on first and second, (6) runners
on second and third, (7) runners on first and third, and (8) bases loaded.
These 8 states are crossed with each of three out states (0 outs, 1 out, or
2 outs) to form 24 states. The 25th and final state is three outs, which marks
the end of an inning.
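As a minimal sketch of this state space, the 25 states can be enumerated in R as shown here. The state labels (for example "12_1" for runners on first and second with one out, and "END" for three outs) are illustrative assumptions and are not necessarily the labels used internally.

```r
# Eight base situations, in the order listed above
# ("0" = bases empty; the digits name the occupied bases).
bases <- c("0", "1", "2", "3", "12", "23", "13", "123")
outs <- 0:2

# Cross the base situations with the out counts (24 states), then
# append the absorbing three-out state. Labels are illustrative only.
grid <- expand.grid(bases = bases, outs = outs, stringsAsFactors = FALSE)
states <- c(paste0(grid$bases, "_", grid$outs), "END")

length(states)
```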
We expand these 25 states to incorporate "fast" players. We make the following
assumptions concerning fast players:
(1) If a fast player is on first and the batter hits a single, the fast
player will stretch to third base (leaving the batter on first).
(2) If a fast player is on second and the batter hits a single, the fast
player will stretch home (leaving the batter on first with one run scored).
(3) If a fast player is on first and the batter hits a double, the fast
player will stretch home (leaving the batter on second base with one run scored).
(4) A typical player (not fast) who successfully steals a base will become
a fast player for the remainder of that inning (meaning that a player
who successfully steals second base will stretch home on a single).
Based on these assumptions, we add base situations that designate runners on first
and second base as either typical runners (R) or fast runners (F). The entirety
of these base situations can be viewed using plot.chain with fast = TRUE.
Aside from these fast player assumptions, runners advance bases as expected (a single
advances each runner one base, a double advances each runner two bases, etc.).
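The rules for a single can be sketched as a small R function. This is an illustration under stated assumptions, not the package's internal code: the function name advance_single and its arguments are hypothetical, and the choice to hold a fast runner at second when a typical runner ahead of them has just taken third is an assumption that the text above does not address.

```r
# Hypothetical helper: apply a single to a base situation.
# first, second: NA (empty), "R" (typical runner) or "F" (fast runner)
# third: TRUE if occupied; batter: "R" or "F"
advance_single <- function(first = NA, second = NA, third = FALSE, batter = "R") {
  runs <- 0
  new_second <- NA
  new_third <- FALSE

  # A runner on third advances one base and scores.
  if (third) runs <- runs + 1

  # A fast runner on second stretches home; a typical runner stops at third.
  if (!is.na(second)) {
    if (second == "F") runs <- runs + 1 else new_third <- TRUE
  }

  # A fast runner on first stretches to third if it is open (assumption:
  # otherwise they stop at second); a typical runner takes second.
  if (!is.na(first)) {
    if (first == "F" && !new_third) new_third <- TRUE else new_second <- first
  }

  list(first = batter, second = new_second, third = new_third, runs = runs)
}

# A fast runner on first ends up on third; the batter takes first.
advance_single(first = "F")
```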
Each at-bat results in a change to the base situation and/or the number of outs. The
outcomes of an at-bat are limited to:
batter out (O): base state does not change, outs increase by one
single (S): runners advance accordingly, score may increase, outs do not change
double (D): runners advance accordingly, score may increase, outs do not change
triple (TR): runners advance accordingly, score may increase, outs do not change
home run (HR): bases cleared, score increases accordingly, outs do not change
walk (W): runners advance accordingly, score may increase, outs do not change
The transitions resulting from these outcomes are stored in "transition matrices." We
utilize separate transition matrices for typical batters and fast batters (in order to
keep fast runners designated separately). We additionally incorporate stolen bases.
Steals are handled separately from the six at-bat outcomes because they do not change
the batter. Following softball norms, we consider only steals of second
base. Steals are considered when there is a runner on first and no runner on second.
In this situation, steal possibilities are limited to:
no steal attempt: base situation and outs do not change
successful steal: runner advances to second base
caught steal: runner is removed, outs increase by one
Steal possibilities are implemented in separate transition matrices. All transition
matrices are stored as internal RData files.
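To illustrate what a transition matrix holds, the sketch below builds one for a deliberately reduced state space that tracks outs only and ignores the base situation. The out probability is a hypothetical value, and the internal matrices cover the full set of base and out states rather than this four-state reduction.

```r
# Hypothetical out probability for one batter (the "O" outcome).
p_out <- 0.65

# States are the number of outs: 0, 1, 2, and the absorbing state 3.
out_states <- c("0", "1", "2", "3")
P <- matrix(0, nrow = 4, ncol = 4,
            dimnames = list(from = out_states, to = out_states))

for (i in 1:3) {
  P[i, i] <- 1 - p_out   # any non-out outcome leaves the out count unchanged
  P[i, i + 1] <- p_out   # a batter out adds one out
}
P["3", "3"] <- 1         # three outs ends the inning

# Every row of a transition matrix sums to one.
rowSums(P)
```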
The stats input must be a data frame containing player probabilities. It must
contain columns "O", "S", "D", "TR", "HR", and "W" whose entries in each row are probabilities
summing to one, giving the probability that a player's at-bat results in each outcome.
The data frame must contain either a "NAME" or "NUMBER" column to identify players (these
must correspond to the lineup). Extra rows for players not in the lineup will be ignored.
This data frame may be generated from player statistics using prob_calc.
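A minimal sketch of a stats data frame with the required columns follows. The player names and probabilities are made up for illustration, and the final check simply confirms that each row of outcome probabilities sums to one.

```r
# Hypothetical players and outcome probabilities (illustrative values only).
stats <- data.frame(
  NAME = c("A", "B", "C"),
  O  = c(0.60, 0.65, 0.70),
  S  = c(0.20, 0.18, 0.15),
  D  = c(0.06, 0.05, 0.04),
  TR = c(0.01, 0.01, 0.01),
  HR = c(0.03, 0.02, 0.01),
  W  = c(0.10, 0.09, 0.09),
  stringsAsFactors = FALSE
)

# Each player's six outcome probabilities should sum to one.
outcome_cols <- c("O", "S", "D", "TR", "HR", "W")
row_totals <- rowSums(stats[, outcome_cols])
stopifnot(isTRUE(all.equal(unname(row_totals), rep(1, nrow(stats)))))
```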
The stats data frame may optionally include an "SBA" (stolen base attempt) column
that provides the probability a given player will attempt a steal (provided they are on first
base with no runner on second). If "SBA" is specified, the data frame must also include
an "SB" (stolen base) column that provides the probability of a given player successfully
stealing a base (conditional on them attempting a steal). If these probabilities are not
specified, calculations will not involve any steals.
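Because "SB" is conditional on an attempt, the two columns combine into probabilities for the three steal possibilities. The helper below is a hypothetical sketch of that arithmetic (the name steal_probs is not part of the package), using made-up input values.

```r
# Hypothetical helper: combine the attempt probability (SBA) and the
# conditional success probability (SB) into the three steal possibilities.
steal_probs <- function(sba, sb) {
  stopifnot(sba >= 0, sba <= 1, sb >= 0, sb <= 1)
  c(no_attempt = 1 - sba,         # runner stays on first
    success    = sba * sb,        # runner advances to second
    caught     = sba * (1 - sb))  # runner removed, outs increase by one
}

# Illustrative values: attempts 30% of the time, succeeds on 80% of attempts.
steal_probs(sba = 0.30, sb = 0.80)
```

The three probabilities always sum to one, and setting sba to zero reproduces the no-steal case.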
The stats data frame may also include a logical "FAST" column that indicates
whether a player is fast. If this column is not specified, the "FAST" designation
will be assigned based on each player's "SBA" probability. Generally, players who are more
likely to attempt steals are the fast players.
The cycle parameter is a useful tool for evaluating an entire lineup. Through the course
of a game, any of the nine players may lead off an inning. A weighted or unweighted average
of these nine expected scores provides a more holistic representation of the lineup than
the expected score based on a single lead-off.
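The averaging step can be sketched as follows. Both the nine expected scores and the weights are hypothetical numbers chosen for illustration; in practice the weights might reflect how often each lineup slot leads off an inning.

```r
# Hypothetical expected scores, one per possible lead-off batter (slots 1 to 9).
expected_scores <- c(0.52, 0.48, 0.45, 0.41, 0.38, 0.36, 0.35, 0.40, 0.47)

# Unweighted average: every lead-off slot counts equally.
unweighted <- mean(expected_scores)

# Hypothetical weights: assumed share of innings led off by each slot.
weights <- c(0.20, 0.10, 0.10, 0.12, 0.10, 0.10, 0.10, 0.09, 0.09)
weighted <- weighted.mean(expected_scores, weights)

c(unweighted = unweighted, weighted = weighted)
```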