Find all hierarchical submodels of a specified GLM whose information criterion (AIC, BIC, or AICc) is within a specified cutoff of the minimum value. Alternatively, find all such graphical models. A branch and bound algorithm is used so that not every model has to be fitted.
glmbb(big, little = ~ 1, family = poisson, data,
criterion = c("AIC", "AICc", "BIC"), cutoff = 10,
trace = FALSE, graphical = FALSE, ...)
little
a formula specifying the smallest model to be considered.
The response may be omitted and, if not omitted, is ignored (the response
is taken from big). Default is ~ 1. The model specified must
be nested within the model specified by big.
family
a description of the error distribution and link
function to be used in the model. This can be a
character string naming a family function, a family function, or the
result of a call to a family function. (See family
for details of family functions.)
data
an optional data frame, list, or environment (or object
coercible by as.data.frame to a data frame)
containing the variables in the models. If not found in data, the
variables are taken from environment(big),
typically the environment from which glmbb is called.
criterion
a character string specifying the information criterion;
must be one of "AIC" (Akaike Information Criterion, the default),
"BIC" (Bayes Information Criterion), or "AICc" (AIC corrected
for sample size).
cutoff
a nonnegative real number. This function finds all
hierarchical models that are submodels of big and supermodels of
little with information criterion less than or equal to the
cutoff plus the minimum information criterion over all these models.
trace
logical. Emit debug info if TRUE.
graphical
logical. If TRUE, search only over graphical models
rather than hierarchical models.
...
additional named or unnamed arguments to be passed
to glm.
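The cutoff rule described for the cutoff argument can be illustrated with a minimal sketch in base R. It fits a small hand-picked list of models exhaustively, which is exactly what branch and bound is meant to avoid, so it is not how glmbb works internally. The data are simulated, and the names d, forms, fits, crit, and keep are hypothetical.

```r
# Simulated count data on a 2 x 3 x 2 table; variable names are hypothetical.
set.seed(42)
d <- expand.grid(bar = factor(1:2), baz = factor(1:3), qux = factor(1:2))
d$foo <- rpois(nrow(d), lambda = 10)

# A hand-picked list of hierarchical models between ~ 1 and bar * baz * qux.
forms <- list(
    foo ~ 1,
    foo ~ bar,
    foo ~ bar + baz,
    foo ~ bar + baz + qux,
    foo ~ bar * baz + qux,
    foo ~ (bar + baz + qux)^2
)

# Fit every model in the list (exhaustive, unlike branch and bound).
fits <- lapply(forms, function(f) glm(f, family = poisson, data = d))

# Criterion value for each fit; BIC could be substituted for AIC here.
crit <- vapply(fits, AIC, numeric(1))

# Keep the models whose criterion is within cutoff of the minimum.
cutoff <- 10
keep <- crit <= min(crit) + cutoff

data.frame(
    model = vapply(forms, function(f) deparse(f[[3]]), character(1)),
    criterion = crit,
    keep = keep
)
```

The model with the minimum criterion is always kept, and cutoff = 0 keeps only the models that attain the minimum.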
An object of class "glmbb" containing at least the following
components:
the model frame, a data frame containing all the variables.
the argument little.
the argument big.
the argument criterion.
the argument cutoff.
an R environment object containing all of the fits done.
the minimum value of the criterion.
the argument graphical.
Typical value for big is something like foo ~ bar * baz * qux
where foo is the response variable (or matrix when family is
binomial or quasibinomial,
see glm) and bar, baz, and qux
are all the predictors that are considered for inclusion in models.
A model is hierarchical if it includes all lower-order interactions for each
term. This is automatically what formulas with all variables connected by
stars (*) do, like the example above.
But other specifications are possible.
For example, foo ~ (bar + baz + qux)^2 specifies the model with all
main effects, and all two-way interactions, but no three-way interaction,
and this is hierarchical.
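The definition of a hierarchical model can be checked mechanically from the terms of a formula. The following is a minimal sketch in base R; the function name is_hier_sketch is hypothetical, and it is not claimed to reproduce the package's own isHierarchical.

```r
# Sketch: TRUE if every interaction term has all its lower-order terms present.
is_hier_sketch <- function(formula) {
    # Rows of "factors" are variables, columns are terms;
    # a nonzero entry means the variable appears in the term.
    fac <- attr(terms(formula), "factors")
    if (length(fac) == 0) return(TRUE)      # intercept-only model
    term_vars <- lapply(seq_len(ncol(fac)), function(j) {
        sort(rownames(fac)[fac[, j] > 0])
    })
    keys <- vapply(term_vars, paste, character(1), collapse = ":")
    for (v in term_vars) {
        if (length(v) < 2) next             # main effects need nothing below
        for (i in seq_along(v)) {
            # Dropping one variable must give a term that is in the model.
            # Checking this for every term covers all lower orders.
            if (!(paste(v[-i], collapse = ":") %in% keys)) return(FALSE)
        }
    }
    TRUE
}

is_hier_sketch(foo ~ bar * baz * qux)
is_hier_sketch(foo ~ (bar + baz + qux)^2)
is_hier_sketch(foo ~ bar + bar:baz)         # main effect baz is missing
```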
A model \(m_1\) is nested within a model \(m_2\) if all terms
in \(m_1\) are also terms in \(m_2\). The default little model
~ 1 is nested within every model except those specified to have
no intercept by 0 + or some such (see formula).
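The nesting condition can be sketched as a comparison of term labels. This is a minimal illustration in base R with a hypothetical function name, is_nested_sketch. It assumes that term labels contain ":" only as the interaction separator, so unusual terms such as I(a:b) are not handled.

```r
# Sketch: TRUE if every term of m1 (including the intercept) is a term of m2.
is_nested_sketch <- function(m1, m2) {
    t1 <- terms(m1)
    t2 <- terms(m2)
    # Normalize labels so that a:b and b:a compare equal.
    norm <- function(labs) {
        vapply(strsplit(labs, ":", fixed = TRUE),
               function(v) paste(sort(v), collapse = ":"),
               character(1))
    }
    # An intercept in m1 requires an intercept in m2.
    int_ok <- attr(t1, "intercept") <= attr(t2, "intercept")
    int_ok && all(norm(attr(t1, "term.labels")) %in% norm(attr(t2, "term.labels")))
}

is_nested_sketch(~ 1, foo ~ bar * baz)          # default little model
is_nested_sketch(~ 1, foo ~ 0 + bar)            # no intercept in the big model
is_nested_sketch(~ bar + baz, foo ~ bar * baz * qux)
```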
The interaction graph of a model is the undirected graph whose node set is
the predictor variables in the model and whose edge set has one edge for each
pair of variables that are in an interaction term. A clique in a graph is
a maximal complete subgraph. A model is graphical if it is hierarchical
and has an interaction term for the variables in each clique.
When graphical = TRUE only graphical models are considered.
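The definition of a graphical model can be made concrete with a brute-force sketch in base R. It builds the interaction graph from the formula's terms, enumerates all subsets of predictors to find the cliques (feasible only for a handful of variables), and checks that each clique has a matching term. The function name is_graphical_sketch is hypothetical, and this is not claimed to reproduce the package's own isGraphical.

```r
# Sketch: TRUE if the model is hierarchical and has a term for every clique
# of its interaction graph.
is_graphical_sketch <- function(formula) {
    fac <- attr(terms(formula), "factors")
    if (length(fac) == 0) return(TRUE)              # intercept-only model
    fac <- fac[rowSums(fac) > 0, , drop = FALSE]    # drop the response row
    vars <- rownames(fac)
    n <- length(vars)
    term_vars <- lapply(seq_len(ncol(fac)), function(j) sort(vars[fac[, j] > 0]))
    keys <- vapply(term_vars, paste, character(1), collapse = ":")

    # Hierarchy: dropping one variable from any term gives another term.
    for (v in term_vars) {
        if (length(v) < 2) next
        for (i in seq_along(v)) {
            if (!(paste(v[-i], collapse = ":") %in% keys)) return(FALSE)
        }
    }

    # Interaction graph: two variables are adjacent if some term has both.
    inc <- (fac > 0) * 1
    adj <- tcrossprod(inc) > 0

    # Enumerate nonempty subsets of variables via their binary expansion.
    for (mask in seq_len(2^n - 1)) {
        s <- which((mask %/% 2^(seq_len(n) - 1)) %% 2 == 1)
        if (!all(adj[s, s])) next                   # not a complete subgraph
        others <- setdiff(seq_len(n), s)
        extendable <- any(vapply(others, function(v) all(adj[v, s]), logical(1)))
        if (extendable) next                        # complete but not maximal
        # s is a clique, so the model must contain the matching term.
        if (!(paste(sort(vars[s]), collapse = ":") %in% keys)) return(FALSE)
    }
    TRUE
}

is_graphical_sketch(foo ~ bar * baz * qux)
is_graphical_sketch(foo ~ bar * baz + baz * qux)
is_graphical_sketch(foo ~ (bar + baz + qux)^2)      # clique without its term
```

The last call shows a model that is hierarchical but not graphical: the three predictors form a single clique, yet the three-way interaction is absent.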
Hand, D. J. (1981) Branch and bound in statistical data analysis. The Statistician, 30, 1-13.
family,
formula,
glm,
isGraphical,
isHierarchical
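library(glmbb)
# load the crabs data shipped with glmbb
data(crabs, package = "glmbb")
# search all hierarchical submodels of the three-way model, ranked by BIC
gout <- glmbb(satell ~ (color + spine + width + weight)^3,
    criterion = "BIC", data = crabs)
summary(gout)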