This help page describes in detail technical points related to the definition of
models in chouca. If this is your first time working with chouca,
you may like the longer introduction in the vignette, accessible using
vignette("chouca-package").
camodel allows defining a stochastic cellular automaton model by its set of
transition rules. These are defined by a set of calls to the transition()
function. Each of these calls defines the two states of the transition, and the
probability as a one-sided formula involving constants and the special vectors
p and q.
transition() calls takes three arguments: the state from which the transition
is defined, the state to which the transition goes, and a transition probability,
defined as a one-sided formula. This formula can include numerical constants,
parameters defined in the named list parms, and any combination of
p['a'] and q['b'], which respectively represent the proportion
of cells in a landscape in state 'a', and the proportion of neighbors of a given cell
in state 'b' ('a', and 'b' being, of course, any of the possible states defined in the
model). Such formula could typically look like ~ 0.2 + 0.3 * p["a"] + q["b"].
See below for examples of model definitions.
It is important to remember when using this function that chouca only
supports models where the probabilities depend on constant parameters, the global
proportion of each state in the landscape, and the local proportion of cells around
a given cell. In other words, all transition probabilities should have the following
functional form:
$$a_0 + \sum_{k=1}^S g_k(q_k) + s(q, q) + s(p, q) + s(q, q)$$
where \(a_0\) is a constant, \(g_k\) are univariate functions of \(q_k\),
the proportions of neighbors of a cell in state k, and \(q\) is the vector
containing all the \(q_k\) for k between \(1\) and \(S\), the total number of
states in the model. Similarly, \(p\) is the length-\(S\) vector containing the
proportion of cells in each state in the whole grid. \(s\) above is the sum, defined
for two vectors \(x = (x_1, ..., x_S)\) and \(y = (y_1, ..., y_S)\) as
$$
a_1 x_1^{\alpha_1} y_1^{\beta_1} +
a_2 x_1^{\alpha_2} y_2^{\beta_2} +
a_3 x_1^{\alpha_3} y_3^{\beta_3} +
a_4 x_2^{\alpha_3} y_1^{\beta_3} +
a_4 x_2^{\alpha_3} y_2^{\beta_3} +
\dots +
a_K x_S^{\alpha_K} y_S^{\beta_K}
$$
where the \(a_k\), \(\alpha_k\) and \(\beta_k\) are constants for all \(k\),
and \(K\) is the total number of terms (equal to \(S^2\)). Note that
\(\alpha_K\) and \(\beta_K\) are capped to 5. This can be overriden using
options(chouca.degmax = n), but we do not recommend changing it as higher
values typically make the package slow and/or leads to numerical instabilities.
The functions \(g_k\) above can be any univariate functions of \(q_k\), so
chouca effectively supports any type of transition rule involving the
neighborhood of a cell, including some 'threshold' rules that involve a single state
(and only one). For example, a rule such as "more than
5 neighbors in a given state make a cell switch from state A to B" is OK, but
combining states may not be supported, such as "more than 5
neighbors in state A *and* 2 in state B means a cell switches from A to B". When in
doubt, just write your model, and chouca will tell you if it cannot run it
accurately by running model checks.
Model checks are controlled by the argument check_model. When set to "quick"
or "full", a check is performed to make sure the functional form above is able to
accurately represent probabilities of transitions in the model, with "full" enabling
more extensive testing, and "none" removing it entirely. Coefficients in the formula
above are rounded down to zero when below epsilon. This may be an issue if
your transition probabilities are close to zero: consider reducing epsilon to
a smaller value in this case, or adjusting your model parameters.
When space does not wrap around (wrap = FALSE), cells in the corners
or in the edges will have a lower number of neighbors. The proportions
of cells in a given state \(k\), \(q_k\), will thus be computed with a reduced
number of cells. For example, a cell in a corner will have only 2 neighbors
when using a 4x4 neighborhood, so \(q_k\) is computed using only two cells, and
can be only equal to 0, 0.5 or 1.
To run a model once it is defined, the function run_camodel can be
used, or run_meanfield for a mean-field approximation. An initial
landscape for a simulation can be created using generate_initmat.
You can update a model definition with new parameters (all of them or a subset)
using the update method. The model graph with the
different states and transitions can be displayed using the plot method
(this requires the package igraph).