addreg.smooth fits additive (identity-link) Poisson, negative binomial and binomial regression models using a stable EM algorithm. It provides additional flexibility over addreg by allowing for semi-parametric terms.
addreg.smooth(formula, mono = NULL, family, data, standard, subset,
na.action, offset, control = list(...), model = TRUE,
model.addreg = FALSE, method = c("cem", "em"),
accelerate = c("em", "squarem", "pem", "qn"),
control.method = list(), ...)
an object of class "formula" (or one that can be coerced into that class): a symbolic description of the model to be fitted. The details of model specification are given under "Details". The model must contain an intercept and at least one semi-parametric term, included by using the B or Iso functions. Note that terms of second order or above (such as interactions) are not currently supported (see addreg).
a vector indicating which terms in formula should be restricted to have a monotonically non-decreasing relationship with the outcome. May be specified as names or indices of the terms. Iso() terms are always monotonic.
a description of the error distribution to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function (see family for details of family functions), but here it is restricted to the poisson, negbin1 or binomial family with identity link.
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which addreg.smooth is called.
a numeric vector of length equal to the number of cases, where each element is a positive constant that (multiplicatively) standardises the fitted value of the corresponding element of the response vector. Ignored for binomial family (the two-column specification of response should be used instead).
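To make the multiplicative role of standard concrete: under an identity link, the fitted mean for each case is its standard value times the linear predictor. The sketch below reproduces that structure with base R's glm rather than addreg.smooth. The data and the names s, x and fit are made up for illustration, and glm does not share the stable EM fitting that addreg.smooth provides.

```r
# Hypothetical data: s plays the role of `standard` (e.g. person-years at risk)
s <- c(10, 20, 10, 40, 20)
x <- c(1, 2, 3, 1, 2)
# Counts chosen to lie exactly on mean = s * (0.5 + 0.2 * x)
y <- c(7, 18, 11, 28, 18)

# mean = s * (b0 + b1 * x) = b0 * s + b1 * (s * x), so drop the intercept
# and use s and s * x as the two columns of the design matrix
fit <- glm(y ~ 0 + s + I(s * x), family = poisson(link = "identity"))

# Estimates of b0 and b1 on the rate scale (per unit of s)
coef(fit)
```

Because the coefficients sit on the rate scale, they can be read as rate differences per unit of x, which is the interpretation an additive model is meant to give.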
an optional vector specifying a subset of observations to be used in the fitting process.
a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The "factory-fresh" default is na.omit. Another possible value is NULL, meaning no action. The value na.exclude can be useful.
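The practical difference between na.omit and na.exclude shows up in the length of fitted values and residuals. The sketch below uses base R's glm on made-up data, on the assumption that addreg.smooth treats na.action in the same way; the names toy, fit_omit and fit_excl are illustrative only.

```r
# Toy data with one missing covariate value (values invented for illustration)
toy <- data.frame(y = c(5, 6, 8, 7, 10, 11),
                  x = c(1, 2, NA, 4, 5, 6))

fit_omit <- glm(y ~ x, family = poisson(link = "identity"),
                data = toy, na.action = na.omit)
fit_excl <- glm(y ~ x, family = poisson(link = "identity"),
                data = toy, na.action = na.exclude)

# na.omit drops the incomplete row from the fitted values
length(fitted(fit_omit))
# na.exclude pads the fitted values with NA so they line up with the rows of toy
length(fitted(fit_excl))
fitted(fit_excl)
```

With na.exclude, fitted values and residuals can be bound back onto the original data frame without re-indexing.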
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a non-negative numeric vector of length equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See model.offset. Ignored for binomial family; not yet implemented for negative binomial models.
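The rule that several offset terms are summed can be checked directly with model.frame and model.offset from base R. This is a small sketch on made-up data (the names toy, a, b, mf and off are illustrative) and does not call addreg.smooth itself.

```r
# Made-up data with two known, non-negative components a and b
toy <- data.frame(y = c(4, 6, 9),
                  a = c(0.5, 1.0, 1.5),
                  b = c(2, 2, 2))

# Two offset() terms in the same formula
mf <- model.frame(y ~ offset(a) + offset(b), data = toy)

# model.offset() returns the sum of all offset terms, here a + b
off <- model.offset(mf)
off
```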
list of parameters for controlling the fitting process, passed to addreg.control.
a logical value indicating whether the model frame (and, for binomial models, the equivalent Poisson model) should be included as a component of the returned value.
a logical value indicating whether the fitted addreg object should be included as a component of the returned value.
a character string that determines which EM-type algorithm to use to find the MLE: "cem" for the combinatorial EM algorithm, which cycles through a sequence of constrained parameter spaces, or "em" for a single EM algorithm based on an overparameterised model.
a character string that determines the acceleration algorithm to be used, (partially) matching one of "em" (no acceleration, the default), "squarem", "pem" or "qn". See turboem for further details. Note that "decme" is not permitted.
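Partial matching of this kind is conventionally done in R with match.arg. The sketch below shows that behaviour with a hypothetical helper, pick_accelerate; it is an assumption that addreg.smooth resolves the argument in this way internally.

```r
# Hypothetical helper mirroring the choices listed for `accelerate`
pick_accelerate <- function(accelerate = c("em", "squarem", "pem", "qn")) {
  match.arg(accelerate)
}

# With no value supplied, the first choice is used
default_choice <- pick_accelerate()

# An unambiguous prefix is expanded to the full name
partial_choice <- pick_accelerate("sq")

# A value outside the permitted set raises an error, caught here for display
rejected <- tryCatch(pick_accelerate("decme"),
                     error = function(e) "rejected")

c(default_choice, partial_choice, rejected)
```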
a list of control parameters for the acceleration algorithm, which are passed to the control.method argument of turboem. If any items are not specified, the defaults are used.
arguments to be used to form the default control argument if it is not supplied directly.
An object of class "addreg.smooth", which contains the same objects as class "addreg" (the same as "glm" objects, without contrasts, qr, R or effects components), as well as:
if model.addreg is TRUE, the addreg object for the fully parametric model corresponding to the fitted model.
the minimum and maximum observed values for each of the smooth terms in the model, to help define the covariate space.
the component from interpret.addreg.smooth(formula) that contains the formula term with any additional arguments to the B function removed.
a named list containing the knot vectors for each of the smooth terms in the model.
addreg.smooth performs the same fitting process as addreg, providing a stable maximum likelihood estimation procedure for identity-link Poisson, negative binomial or binomial models, with the added flexibility of allowing semi-parametric B and Iso terms. Note that addreg.smooth will stop with an error if no semi-parametric terms are specified in the right-hand side of the formula; addreg should be used instead.
The method partitions the parameter space associated with the semi-parametric part of the model into a sequence of constrained parameter spaces, and defines a fully parametric addreg model for each. The model with the highest log-likelihood is the MLE for the semi-parametric model (see Donoghoe and Marschner, 2015).
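The idea of maximising over each constrained space and keeping the best result can be illustrated with a deliberately simple toy problem. The sketch below is not the package's algorithm: it uses a one-parameter Poisson mean, a hypothetical three-piece partition of the parameter space, and optimize in place of the EM fit applied to each space.

```r
# Made-up counts; the unconstrained MLE of the Poisson mean is mean(y) = 4
y <- c(3, 5, 4, 6, 2)
loglik <- function(mu) sum(dpois(y, lambda = mu, log = TRUE))

# Hypothetical partition of the parameter space into three constrained pieces
pieces <- list(c(0.01, 2), c(2, 5), c(5, 10))

# Maximise the log-likelihood separately within each piece
fits <- lapply(pieces, function(p) optimize(loglik, interval = p, maximum = TRUE))

# The piece with the highest maximised log-likelihood contains the overall MLE
best <- which.max(vapply(fits, function(f) f$objective, numeric(1)))
mle <- fits[[best]]$maximum
c(best_piece = best, mle = mle)
```

In the pieces that do not contain the overall maximum, the constrained maximum falls on the boundary of that piece, which is the kind of boundary solution the fitting method needs to handle stably.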
Acceleration of the EM algorithm can be achieved through the methods of the turboEM package, specified through the accelerate argument. However, note that these methods do not have the guaranteed convergence of the standard EM algorithm, particularly when the MLE is on the boundary of its (possibly constrained) parameter space.
Donoghoe, M. W. and I. C. Marschner (2015). Flexible regression models for rate differences, risk differences and relative risks. International Journal of Biostatistics 11(1): 91-108.
Marschner, I. C. (2014). Combinatorial EM algorithms. Statistics and Computing 24(6): 921-940.
library(addreg)

## Simple example
dat <- data.frame(x1 = c(3.2, 3.3, 3.4, 7.9, 3.8, 0.7, 2.0, 5.4, 8.4, 3.0, 1.8, 5.6, 5.5, 9.0, 8.2),
                  x2 = c(1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0),
                  n = c(6, 7, 5, 9, 10, 7, 9, 6, 6, 7, 7, 8, 6, 8, 10),
                  y = c(2, 1, 2, 6, 3, 1, 2, 2, 4, 4, 1, 2, 5, 7, 7))

## Binomial model with a B() smooth term in x1, constrained to be monotone
## via mono = 1, plus a factor for x2
m1 <- addreg.smooth(cbind(y, n - y) ~ B(x1, knot.range = 1:3) + factor(x2), mono = 1,
                    data = dat, family = binomial, trace = 1)

## Plot the fitted curve at each level of x2 and overlay the observed proportions
plot(m1, at = data.frame(x2 = 0:1))
points(dat$x1, dat$y / dat$n, col = rainbow(2)[dat$x2 + 1], pch = 20)