ppgam fits a Poisson process with intensity function \(\lambda({\bf x})\) for covariate
\({\bf x} = (x_1, \ldots, x_d)\). The likelihood for this model with events occurring at \({\bf x}_i\),
for \(i=1, \ldots, n\), is approximated by quadrature with
$$\exp\bigg[-\sum_{j=1}^m w_j \lambda({\bf x}_j^*)\bigg] \prod_{i=1}^n \lambda({\bf x}_i)$$
where \({\bf x}_j^*\) and \(w_j\) are quadrature nodes and weights, for \(j=1, \ldots, m\),
defined with nodes and weights.
formula gives the formula for the log intensity function of a Poisson process.
It is passed to gam. If formula has no response, i.e. ~ s(...),
then data is assumed to give the times at which events occur. Then nodes
is used to control integration of the intensity function. If formula has a response,
e.g. y ~ s(...), then y is assumed binary, comprising only zeros and ones.
Then data is assumed to give the state space of the Poisson process,
(e.g. daily time steps if occurrences of events are measured in days)
and ones in y identify when events occur. Note that
if formula has no response, data will have \(n\) rows, and \(m\) rows otherwise.
nodes is used to supply nodes for integrating the Poisson
process intensity function by quadrature. It is supplied as a list or data
frame.
If nodes is a list, its names must correspond to variables on
the r.h.s. of formula. Elements of the list, x, say, can be a vector
or 2-column matrix, where length(x) > 1 or nrow(x) > 1. If a
matrix, its first and second columns are taken as integration nodes and
weights, respectively. If a vector of length 2, it is assumed to give the
range of the nquad midpoints used as integration nodes. If a longer vector,
it is assumed to be the integration nodes, and nquad is ignored.
If nodes is a data frame, it is assumed to give the integration nodes.
nquad specifies the number of integration nodes per variable, unless
nodes are specified in nodes. If a single integer and
is.null(names(nquad)) it is used for all variables. Otherwise,
names are matched to variables. An error is returned if any variables do not
have values specified.
weights controls the quadrature weights. If nodes is a list,
a scalar multiplies any weights calculated alongside nodes, i.e. node separations.
If nodes is a data frame, weights can be a scalar that is repeated
nrow(nodes), or a vector of length nrow(nodes) that gives the weights
for each row of nodes.
approx controls quadrature details. Its first term controls the
integration method, which uses either midpoint ("midpoint", default),
Simpson's ("Simpson") or Gauss-Legendre ("Gauss") rules. The second
term of approx controls the integration range, which is either the
range of the variable ("exact"), or by calling pretty() ("pretty").
trace controls what is reported. Details of convergence are printed
with trace = 1, of nodes with trace = 2, and trace = 3
prints both.
weight.non.numeric applied to any non-numeric variables, and gives non-equal
quadrature weights to different nodes if TRUE. So nodes get weights according
to their frequency of occurrence in data. If nquad is invoked, only a
subset of the unique values of the non-numeric variable are used, which are the nquad
with largest weights.