dredge: Evaluate "all possible" models

Description

Automatically generate models with combinations of the terms in the global model, with optional restrictions.

Usage

dredge(global.model, beta = FALSE, eval = TRUE, rank = "AICc",
    fixed = NULL, m.max = NA, subset, marg.ex = NULL,
    trace = FALSE, ...)
## S3 method for class 'model.selection':
print(x, abbrev.names = TRUE, ...)

Arguments

global.model

a fitted global model object. Currently, it can be a lm, glm, rlm, polr, multinom, gam, gls, lme, lmer, cox

beta

logical, should standardized coefficients be returned?

eval

whether to evaluate and rank the models. If FALSE, a list of all possible model formulas is returned.

rank

optional custom rank function (information criterion) to be used instead of AICc, e.g. QAIC or BIC. See Details.

fixed

optional, either a one-sided formula or a character vector giving the names of terms to be included in all models.

m.max

optional, maximum number of terms to be included in a single model; defaults to the number of terms in global.model.

subset

logical expression constraining the set of models. Can contain any of the global.model terms (use getAllTerms(global.model) to list them). Complex expressions (e.g smooth functions in

marg.ex

a character vector specifying names of variables for which NOT to check for marginality restrictions when generating model formulas. If this argument is set to TRUE, all model formulas are used (i.e. no checking). See Deta

trace

if TRUE, all calls to the fitting function (i.e. updated global.model calls) are printed.

x

a model.selection object, returned by dredge.

abbrev.names

logical, should variable names be abbreviated when printing? Useful with many variables.

...

optional arguments for the rank function. Any of them can be an expression (of mode call), in which case any x within it is replaced by the current model.

Value

dredge returns an object of class model.selection, which is a data.frame with the models' coefficients (or TRUE/FALSE for factors), k, deviance/RSS, R-squared, AIC, AICc, delta and weight. The exact set of columns depends on the type of model. Models are ordered by the information criterion specified by rank (lowest on top).
The attribute "calls" is a list containing the model calls used (arranged in the same order as the models).

Details

Models are run one by one by repeated evaluation of the call to global.model with a modified formula argument (or fixed in lme). This method, while robust in that it can be applied to a variety of different models, is not very efficient and may be considerably time-intensive.

Note that the number of combinations grows exponentially with the number of predictor variables (2^N). Because there is potentially a large number of models to evaluate, the fitted model objects are not stored, to avoid memory overflow. To get (a subset of) the models, use get.models with the object returned by dredge as an argument.
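
As a worked count, assuming a global model with N main-effect terms and no interactions (so marginality excludes nothing) and an intercept in every model. The figures for the Cement example assume its four predictors X1 to X4:

```latex
\[
  \#\text{models} = \sum_{k=0}^{N} \binom{N}{k} = 2^{N},
  \qquad
  \#\text{models with at most } m \text{ terms} = \sum_{k=0}^{m} \binom{N}{k}.
\]
% Cement example: N = 4 predictors
\[
  2^{4} = 16,
  \qquad
  \binom{4}{0} + \binom{4}{1} + \binom{4}{2} = 1 + 4 + 6 = 11
  \quad (\texttt{m.max = 2}).
\]
```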

When handling interactions, dredge respects marginality constraints, so "all possible" combinations do not include models containing interactions without their respective main effects. This behaviour can be altered with the marg.ex argument, which can be used to allow for simple nested designs. For example, with a global model of the form a / (x + z), use marg.ex = "a" and fixed = "a".

rank is found by a call to match.fun and may be specified as a function or a symbol (e.g. a back-quoted name) or a character string specifying a function to be searched for from the environment of the call to dredge.

The rank function must accept a model as its first argument and must always return a scalar. Typical choices for rank are "AIC", "QAIC" or "BIC" (stats or nlme).
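
A minimal sketch of a custom rank function, using only base R. The name my_bic and the mtcars model are illustrative assumptions, not part of the package; the function only has to take a fitted model as its first argument and return a single number.

```r
# Hypothetical rank function: a hand-written BIC.
# It accepts a fitted model as its first argument and returns a scalar.
my_bic <- function(model) {
  ll <- logLik(model)
  k  <- attr(ll, "df")   # number of estimated parameters
  n  <- nobs(model)      # number of observations used in the fit
  -2 * as.numeric(ll) + k * log(n)
}

# Illustrative model on a built-in data set (not from the examples on this page)
fit <- lm(mpg ~ wt + hp, data = mtcars)

my_bic(fit)                        # a single numeric value
all.equal(my_bic(fit), BIC(fit))   # agrees with stats::BIC for this model

# With the package loaded, it could then be passed as a function or by name:
# dredge(fit, rank = my_bic)
# dredge(fit, rank = "my_bic")
```

Writing the criterion by hand is only for illustration; in practice an existing function such as BIC would normally be passed directly.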

Use of na.action = na.omit (R's default) in global.model should be avoided, because when there are missing values it results in sub-models being fitted to different data sets. In versions >= 0.13.17 a warning is given in such a case.
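
The effect of missing values can be seen with base R alone. This is a small sketch on made-up data (the data frame d and its columns are assumptions for illustration): under na.omit a sub-model that drops the incomplete variable is fitted to more rows than the full model, while na.fail stops with an error instead.

```r
# Toy data with one missing value in x2 (illustrative only)
d <- data.frame(
  y  = c(1.2, 2.3, 2.9, 4.1, 5.2, 6.0),
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(2, NA, 1, 3, 2, 5)
)

# With na.omit, each model silently drops only the rows it cannot use
full <- lm(y ~ x1 + x2, data = d, na.action = na.omit)
sub  <- lm(y ~ x1,      data = d, na.action = na.omit)
n_full <- nobs(full)   # rows complete in y, x1 and x2
n_sub  <- nobs(sub)    # rows complete in y and x1
n_full == n_sub        # FALSE: the two fits use different data

# With na.fail, fitting stops instead of dropping rows
res <- tryCatch(
  lm(y ~ x1 + x2, data = d, na.action = na.fail),
  error = function(e) "error"
)
```

Information criteria computed on different subsets of the data are not comparable, so one option is to remove incomplete rows once, before fitting the global model.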

Examples

# Example from Burnham and Anderson (2002), page 100:
library(MuMIn) # provides dredge, get.models, model.avg and the Cement data
data(Cement)
lm1 <- lm(y ~ ., data = Cement)
dd <- dredge(lm1)
subset(dd, delta < 4)

# models with delta AICc < 4
model.avg(get.models(dd, subset = delta < 4)) # get averaged coefficients

# or as a 95% confidence set:
top.models <- get.models(dd, cumsum(weight) <= .95)

model.avg(top.models) # get averaged coefficients

#topmost model:
top.models[[1]]

# Examples of using 'subset':
# exclude models containing both X1 and X2
dredge(lm1, subset = !(X1 & X2))
# keep only models containing X3
dredge(lm1, subset = X3)
# the same, but more efficient:
dredge(lm1, fixed = "X3")

#Reduce the number of generated models, by including only those with
# up to 2 terms (and intercept)
dredge(lm1, m.max = 2)
