Models for gnm are specified by giving a symbolic description of the
nonlinear predictor, of the form response ~ terms. The response is
typically a numeric vector; see later in this section for alternatives.
The usual symbolic language may be used to specify any linear terms; see
formula for details. gnm has the in-built capability to handle
multiplicative interactions, which can be specified in the model formula
using the symbolic wrapper Mult; e.g., Mult(A, B) specifies a
multiplicative interaction between factors A and B. The family of
multiplicative interaction models includes row-column association models
for contingency tables (e.g., Agresti, 2002, Sec 9.6), log-multiplicative
or UNIDIFF models (Erikson and Goldthorpe, 1992; Xie, 1992), and GAMMI
models (van Eeuwijk, 1995).
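
As a minimal sketch of the Mult wrapper, the following fits a row-column
association model to the occupationalStatus table from R's datasets
package, first converted to a data frame. The choice of data set, the seed
and the verbose = FALSE setting are illustrative assumptions, not part of
the text above.

```r
library(gnm)

# occupationalStatus cross-classifies fathers' (origin) and sons'
# (destination) occupational status; as.data.frame() gives one row per
# cell with factors origin and destination and a count column Freq.
occ <- as.data.frame(occupationalStatus)

set.seed(1)  # the fit is randomly parameterized, so fix the seed
RC <- gnm(Freq ~ origin + destination + Mult(origin, destination),
          family = poisson, data = occ, verbose = FALSE)

RC            # print the fitted model
deviance(RC)  # residual deviance of the association model
```

The linear terms origin and destination use the ordinary formula syntax;
only the interaction is wrapped in Mult.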

Other nonlinear terms may be incorporated in the model via plug-in
functions that provide the objects required by gnm to fit the desired
term. Such terms are specified in the model formula using the symbolic
wrapper Nonlin; e.g., Nonlin(PlugInFunction(A, B)) specifies a term to be
fitted by the plug-in function PlugInFunction involving factors A and B.
The gnm package includes plug-in functions for multiplicative
interactions with homogeneous effects (MultHomog) and diagonal reference
terms (Dref). Users may also define their own plug-in functions; see
Nonlin for details.

The eliminate argument may be used to specify a factor that is to be
included in the model, but excluded from print() displays of the model
object or its components obtained using accessor functions such as
coef(). The eliminated factor is included as the first term in the model
(since an intercept is then redundant, none is fitted). The structure of
the factor is exploited to improve computational efficiency,
substantially so if the number of eliminated parameters is large. Use of
eliminate is designed for terms that are required in the model but are
not of direct interest (e.g., terms needed to fit multinomial-response
models as conditional Poisson models). See backPain for an example.

For contingency tables, the data may be provided as an object of class
"table", from which the frequencies will be extracted to use as the
response. In this case, the response should be specified as Freq in the
model formula. The "predictors", "fitted.values", "residuals",
"prior.weights", "weights", "y" and "offset" components of the returned
gnm fit will be tables with the same format as the data, completed with
NAs where necessary.
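
The table interface can be sketched as follows, passing the
occupationalStatus table (an object of class "table" from the datasets
package) directly as data. The data set and seed are illustrative
choices; no output is shown because it depends on the installed gnm
version.

```r
library(gnm)

set.seed(1)
# The data argument is a "table", so the response is named Freq and the
# factor names come from the table's dimnames (origin, destination).
fit <- gnm(Freq ~ origin + destination + Mult(origin, destination),
           family = poisson, data = occupationalStatus, verbose = FALSE)

# Inspect the shape of two of the components listed above.
class(fitted(fit))
dim(fitted(fit))
dim(residuals(fit))
```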

For binomial models, the response may be specified as a factor in which
the first level denotes failure and all other levels denote success, as a
two-column matrix with the columns giving the numbers of successes and
failures, or as a vector of the proportions of successes.
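
The three response forms can be compared on a small example. The data
below are made up for illustration, and the use of the weights argument
to supply the number of trials alongside proportions follows the usual
binomial-family convention in R rather than anything stated above.

```r
library(gnm)

# Hypothetical grouped data: succ successes out of n trials at each dose.
d <- data.frame(dose = 1:5,
                succ = c(2, 5, 9, 14, 18),
                n    = rep(20, 5))
d$fail <- d$n - d$succ

# 1. Two-column matrix of successes and failures.
fit_matrix <- gnm(cbind(succ, fail) ~ dose, family = binomial,
                  data = d, verbose = FALSE)

# 2. Proportions of successes, with the number of trials as weights.
fit_prop <- gnm(succ / n ~ dose, weights = n, family = binomial,
                data = d, verbose = FALSE)

# 3. Factor response, one row per trial; the first level is failure.
long <- data.frame(
  dose = rep(d$dose, times = d$n),
  outcome = factor(
    unlist(Map(function(s, f) rep(c("success", "fail"), times = c(s, f)),
               d$succ, d$fail)),
    levels = c("fail", "success"))
)
fit_factor <- gnm(outcome ~ dose, family = binomial,
                  data = long, verbose = FALSE)

# The three sets of estimates should agree up to numerical tolerance.
coef(fit_matrix)
coef(fit_prop)
coef(fit_factor)
```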

The gnm fitting algorithm consists of two stages. In the start-up
iterations, any nonlinear parameters that are not specified by either the
start argument of gnm or a plug-in function are updated one parameter at
a time, then the linear parameters are jointly updated before the next
iteration. In the main iterations, all the parameters are jointly
updated, until convergence is reached or the number of iterations reaches
iterMax. The lsMethod argument specifies which numerical method is used
to solve the (typically rank-deficient) least squares problem at the
heart of the gnm fitting algorithm: the options are direct solution using
a QR decomposition ("qr"), and matrix inversion via Cholesky
decomposition ("chol"). In both cases, the design matrix is standardized
and regularized (in the Levenberg-Marquardt sense) prior to solving. If
lsMethod is left unspecified, the default is "qr", unless eliminate is
used, in which case the default is "chol".

Convergence is judged by comparing the squared components of the score
vector with the corresponding elements of the diagonal of the Fisher
information matrix. Iterations cease when, for every component of the
score vector, either the ratio is less than tolerance^2 or the
corresponding diagonal element of the Fisher information matrix is less
than 1e-20.
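
The stopping rule can be written out as a short function. This is a
sketch of the criterion as described, not gnm's internal code: the
function name converged, its argument names and the default tolerance of
1e-6 are assumptions made for illustration.

```r
# Hypothetical helper (not part of gnm): TRUE when every parameter meets
# the criterion, i.e. score^2 / information < tolerance^2, or the
# information for that parameter is below min_info.
converged <- function(score, info_diag, tolerance = 1e-6, min_info = 1e-20) {
  stopifnot(length(score) == length(info_diag))
  tiny <- info_diag < min_info          # parameters with negligible information
  ratio_ok <- rep(FALSE, length(score))
  ratio_ok[!tiny] <- score[!tiny]^2 / info_diag[!tiny] < tolerance^2
  all(tiny | ratio_ok)
}

# First parameter passes on the ratio, second on the information floor.
converged(score = c(1e-8, 5), info_diag = c(1, 1e-25))

# A score of 1e-3 against unit information is still too large.
converged(score = 1e-3, info_diag = 1)
```

Scaling the squared score by the information makes the test invariant to
the units in which each parameter is measured.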

By default, gnm uses an over-parameterized representation of the model
that is being fitted. Only minimal identifiability constraints are
imposed, so that in general a random parameterization is obtained. The
parameter estimates are ordered so that those for any linear terms appear
first.

getContrasts may be used to obtain estimates of specified contrasts, if
these contrasts are identifiable. In particular, getContrasts may be used
to estimate the contrasts between the first level of a factor and the
rest, and to obtain their standard errors.
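
A sketch of getContrasts in use, assuming a recent release of gnm. It
relies on the yaish data set and the Exp wrapper shipped with such
releases, neither of which is described in the text above, and it selects
coefficients by matching ".educ" in their names, which depends on how the
installed version labels coefficients.

```r
library(gnm)

set.seed(1)
# Log-multiplicative (UNIDIFF) model: the origin-destination association
# is scaled by a positive multiplier that varies with education level.
unidiff <- gnm(Freq ~ educ*orig + educ*dest + Mult(Exp(educ), orig:dest),
               family = poisson, data = yaish, subset = (dest != 7),
               verbose = FALSE)

# Indices of the education parameters inside the multiplicative term.
# Inspect names(coef(unidiff)) if the pattern matches nothing.
educ_idx <- grep("[.]educ", names(coef(unidiff)))

# Contrasts of each education level with the first, with standard errors.
getContrasts(unidiff, educ_idx)
```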

If appropriate constraints are known in advance, or have been determined
from a gnm fit, the model may be (re-)fitted using the constrain argument
to specify coefficients which should be set to zero. Constraints should
only be specified for non-eliminated parameters. update provides a
convenient way of re-fitting a gnm model with new constraints.
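
Re-fitting with a constraint might look like the sketch below. The data
set, the seed and the regular expression used to locate the coefficient
are illustrative assumptions; coefficient labels differ between gnm
versions, so the index should be checked against names(coef(RC)) before
use.

```r
library(gnm)

set.seed(1)
RC <- gnm(Freq ~ origin + destination + Mult(origin, destination),
          family = poisson, data = occupationalStatus, verbose = FALSE)

# Position of the multiplicative parameter for the first origin level.
idx <- grep("[.]origin1$", names(coef(RC)))

# Re-fit with that coefficient constrained to zero.
RC0 <- update(RC, constrain = idx)

coef(RC0)[idx]  # the constrained coefficient
deviance(RC)    # the two deviances should be essentially equal if both
deviance(RC0)   # fits converged, since the model is over-parameterized
```

Passing the constraint as a numeric index avoids relying on any
particular coefficient naming scheme.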