logbin.smooth fits log-link binomial regression models using a stable CEM algorithm. It provides additional flexibility over logbin by allowing for smooth semi-parametric terms.
logbin.smooth(formula, mono = NULL, data, subset, na.action, offset,
control = list(...), model = TRUE, model.logbin = FALSE,
method = c("cem", "em"), accelerate = c("em", "squarem", "pem", "qn"),
control.accelerate = list(), ...)
formula: an object of class "formula" (or one that can be coerced into that class) giving a symbolic description of the model to be fitted. The details of model specification are given under "Details". The model must contain an intercept and at least one semi-parametric term, included by using the B or Iso functions. Note that 2nd-order terms (such as interactions) or above are not currently supported (see logbin).
mono: a vector indicating which terms in formula should be restricted to have a monotonically non-decreasing relationship with the outcome. May be specified as names or indices of the terms. Iso() terms are always monotonic.
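To make "monotonically non-decreasing" concrete, here is a minimal base R sketch. It uses stats::isoreg, a least-squares isotonic regression, purely as a stand-in: it is not what Iso() or mono do inside logbin.smooth, and the data are the toy values from this page's examples.

```r
x <- c(0.3, 0.2, 0.0, 0.1, 0.2, 0.1, 0.7, 0.2, 1.0, 0.9)
y <- c(5, 4, 6, 4, 7, 3, 6, 5, 9, 8)

# Least-squares isotonic fit to the observed proportions y / 10
# (a stand-in technique, not the log-binomial MLE).
iso <- isoreg(x, y / 10)

# iso$yf holds the fitted values in order of increasing x, so a
# non-decreasing fit has no negative successive differences.
is_nondecreasing <- all(diff(iso$yf) >= 0)
print(is_nondecreasing)
```

The same diff-based check could be applied to predictions from a fitted model evaluated on a sorted grid of covariate values.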
data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which logbin.smooth is called.
subset: an optional vector specifying a subset of observations to be used in the fitting process.
na.action: a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.
offset: this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a non-positive numeric vector of length equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See model.offset.
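One way to see why the sign of the offset matters: under a log link the fitted probability is exp(linear predictor), so the linear predictor (offset included) cannot exceed 0. The base R sketch below illustrates that constraint only; it is not the package's internal check, and all numbers are hypothetical.

```r
# Hypothetical linear predictor values before any offset is added.
eta_without_offset <- c(-1.2, -0.5, -0.1)

# Non-positive offsets, e.g. logs of proportions in (0, 1].
ok_offset <- log(c(0.5, 0.8, 1.0))
# Positive offsets, e.g. logs of values greater than 1.
bad_offset <- log(c(1.5, 2.0, 1.2))

# Fitted probabilities under a log link.
p_ok  <- exp(eta_without_offset + ok_offset)
p_bad <- exp(eta_without_offset + bad_offset)

print(all(p_ok <= 1))   # non-positive offsets keep these probabilities valid
print(any(p_bad > 1))   # positive offsets push some "probabilities" above 1
```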
control: a list of parameters for controlling the fitting process, passed to logbin.control.
model: a logical value indicating whether the model frame should be included as a component of the returned value.
model.logbin: a logical value indicating whether the fitted logbin object should be included as a component of the returned value.
method: a character string that determines which EM-type algorithm to use to find the MLE: "cem" for the combinatorial EM algorithm, which cycles through a sequence of constrained parameter spaces, or "em" for a single EM algorithm based on an overparameterised model.
Unlike logbin, methods "glm" and "ab" are not available because they do not support the necessary monotonicity constraints.
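As a hedged illustration of the method argument, the sketch below fits the same model with each algorithm and compares the fitted values. It assumes the logbin package is installed and reuses the toy data from this page's examples; the object names m.cem and m.em are illustrative only, and no claim is made about which method is faster or how closely the two fits agree.

```r
# Sketch only: needs the logbin package, and is skipped if it is not installed.
has_logbin <- requireNamespace("logbin", quietly = TRUE)

if (has_logbin) {
  library(logbin)

  x <- c(0.3, 0.2, 0.0, 0.1, 0.2, 0.1, 0.7, 0.2, 1.0, 0.9)
  y <- c(5, 4, 6, 4, 7, 3, 6, 5, 9, 8)

  # Same model, fitted once with each EM-type algorithm.
  m.cem <- logbin.smooth(cbind(y, 10 - y) ~ B(x, knot.range = 0:2),
                         mono = 1, method = "cem")
  m.em  <- logbin.smooth(cbind(y, 10 - y) ~ B(x, knot.range = 0:2),
                         mono = 1, method = "em")

  # Largest absolute difference between the two sets of fitted probabilities.
  print(max(abs(fitted(m.cem) - fitted(m.em))))
}
```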
accelerate: a character string that determines the acceleration algorithm to be used, (partially) matching one of "em" (no acceleration, the default), "squarem", "pem" or "qn". See turboem for further details. Note that "decme" is not permitted.
control.accelerate: a list of control parameters for the acceleration algorithm. See turboem for details of the parameters that apply to each algorithm. If not specified, the defaults are used.
...: arguments to be used to form the default control argument if it is not supplied directly.
An object of class "logbin.smooth", which contains the same objects as class "logbin" (the same as "glm"), as well as:
if model.logbin is TRUE, the logbin object for the fully parametric model corresponding to the fitted model.
the minimum and maximum observed values for each of the smooth terms in the model, to help define the covariate space.
the component from interpret.logbin.smooth(formula) that contains the formula term with any additional arguments to the B function removed.
a named list containing the knot vectors for each of the smooth terms in the model.
logbin.smooth performs the same fitting process as logbin, providing a stable maximum likelihood estimation procedure for log-link binomial GLMs, with the added flexibility of allowing semi-parametric B and Iso terms (note that logbin.smooth will stop with an error if no semi-parametric terms are specified in the right-hand side of the formula; logbin should be used instead).
The method partitions the parameter space associated with the semi-parametric part of the model into a sequence of constrained parameter spaces, and defines a fully parametric logbin model for each. The model with the highest log-likelihood is the MLE for the semi-parametric model (see Donoghoe and Marschner, 2015).
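The final selection step described above (keep the candidate with the highest log-likelihood) can be sketched in base R. This is a toy illustration, not the package's implementation: the three candidate probability vectors are hypothetical stand-ins for the fits that would come from each constrained parameter space, and the helper loglik is defined here for the example only.

```r
x <- c(0.3, 0.2, 0.0, 0.1, 0.2, 0.1, 0.7, 0.2, 1.0, 0.9)
y <- c(5, 4, 6, 4, 7, 3, 6, 5, 9, 8)
n <- 10  # binomial denominator for every observation

# Binomial log-likelihood for a vector of fitted probabilities.
loglik <- function(p) sum(dbinom(y, size = n, prob = p, log = TRUE))

# Hypothetical candidate fits, one per constrained parameter space.
# Each is a log-linear curve in x whose probabilities stay below 1.
candidates <- list(
  flat   = rep(mean(y) / n, length(x)),
  gentle = exp(log(0.45) + 0.7 * x),
  steep  = exp(log(0.40) + 0.9 * x)
)

# Evaluate every candidate and keep the one with the highest log-likelihood.
ll   <- vapply(candidates, loglik, numeric(1))
best <- names(ll)[which.max(ll)]
print(ll)
print(best)
```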
Donoghoe, M. W. and I. C. Marschner (2015). Flexible regression models for rate differences, risk differences and relative risks. International Journal of Biostatistics 11(1): 91--108.
Donoghoe, M. W. and I. C. Marschner (2018). logbin: An R package for relative risk regression using the log-binomial model. Journal of Statistical Software 86(9): 1--22.
Marschner, I. C. (2014). Combinatorial EM algorithms. Statistics and Computing 24(6): 921--940.
library(logbin)

## Simple example: binomial counts y out of 10 trials at each value of x
x <- c(0.3, 0.2, 0.0, 0.1, 0.2, 0.1, 0.7, 0.2, 1.0, 0.9)
y <- c(5, 4, 6, 4, 7, 3, 6, 5, 9, 8)
## Monotonic B-spline term, trying 0, 1 and 2 knots; trace = 1 reports progress
system.time(m1 <- logbin.smooth(cbind(y, 10-y) ~ B(x, knot.range = 0:2), mono = 1, trace = 1))
## Compare with accelerated version
system.time(m1.acc <- update(m1, accelerate = "squarem"))
## Isotonic relationship
m2 <- logbin.smooth(cbind(y, 10-y) ~ Iso(x))

## Plot the fitted smooth terms
plot(m1)
plot(m2)

## Summarise the fitted probabilities
summary(predict(m1, type = "response"))
summary(predict(m2, type = "response"))