Generate a set of models with combinations (subsets) of terms in the global model, with optional rules for model inclusion.

```
dredge(global.model, beta = c("none", "sd", "partial.sd"), evaluate = TRUE,
  rank = "AICc", fixed = NULL, m.lim = NULL, m.min, m.max, subset,
  trace = FALSE, varying, extra, ct.args = NULL, ...)

# S3 method for model.selection
print(x, abbrev.names = TRUE, warnings = getOption("warn") != -1L, ...)
```

global.model

a fitted ‘global’ model object. See ‘Details’ for a list of supported types.

beta

indicates whether and how the coefficient estimates should be standardized, and must be one of `"none"`, `"sd"` or `"partial.sd"`. You can specify just the initial letter. `"none"` corresponds to unstandardized coefficients, `"sd"` and `"partial.sd"` to coefficients standardized by SD and partial SD, respectively. For backwards compatibility, a logical value is also accepted: `TRUE` is equivalent to `"sd"` and `FALSE` to `"none"`. See `std.coef`.

evaluate

whether to evaluate and rank the models. If `FALSE`, a list of unevaluated `call`s is returned.

rank

optional custom rank function (returning an information criterion) to be used instead of `AICc`, e.g. `AIC`, `QAIC` or `BIC`. See ‘Details’.

fixed

optional, either a one-sided formula or a character vector giving names of terms to be included in all models. See ‘Subsetting’.

m.lim, m.max, m.min

optionally, the limits `c(lower, upper)` for the number of terms in a single model (excluding the intercept). An `NA` means no limit. See ‘Subsetting’. Specifying limits as `m.min` and `m.max` is allowed for backward compatibility.

subset

logical expression describing models to keep in the resulting set. See ‘Subsetting’.

trace

if `TRUE` or `1`, all calls to the fitting function are printed before actual fitting takes place. If `trace > 1`, a progress bar is displayed.

varying

optionally, a named list describing the additional arguments to vary between the generated models. Item names correspond to the arguments, and each item provides a list of choices (i.e. `list(arg1 = list(choice1, choice2, ...), ...)`). Complex elements in the choice list (such as `family` objects) should be either named (uniquely) or quoted (unevaluated, e.g. using `alist`, see `quote`), otherwise the result may be visually unpleasant. See the example in `Beetle`.

extra

optional additional statistics to include in the result, provided as functions, function names or a list of such (best if named or quoted). As with the `rank` argument, each function must accept a fitted model object as an argument and return (a value coercible to) a numeric vector. These can be e.g. additional information criteria or goodness-of-fit statistics. The character strings `"R^2"` and `"adjR^2"` are treated in a special way, and will add a likelihood-ratio based R² and modified-R² respectively to the result (this is more efficient than using `r.squaredLR` directly).

x

a `model.selection` object, returned by `dredge`.

abbrev.names

should printed term names be abbreviated? (useful with complex models).

warnings

if `TRUE`, errors and warnings issued during the model fitting are printed below the table (only with `pdredge`). To permanently remove the warnings, set the object's attribute `"warnings"` to `NULL`.

ct.args

optional list of arguments to be passed to `coefTable` (e.g. the `dispersion` parameter for `glm`, affecting standard errors used in subsequent `model averaging`).

…

optional arguments for the `rank` function. Any can be an unevaluated expression, in which case any `x` within it will be substituted with the current model.

An object of class `c("model.selection", "data.frame")`, being a `data.frame`, where each row represents one model. See `model.selection.object` for its structure.

Models are fitted through repeated evaluation of the modified call extracted from the `global.model` (in a similar fashion as with `update`). This approach, while robust in that it can be applied to most model types, is not the most efficient and may be computationally intensive.

Note that the number of combinations grows exponentially with the number of predictors (2ⁿ, fewer when interactions are present, see below).
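The growth can be checked with a few lines of base R. This is only an illustrative sketch: it enumerates in/out patterns for hypothetical term names (`X1` to `X4`), does not call `dredge`, and assumes a model without interactions.

```r
# Hypothetical main-effect terms of a global model (no interactions).
term_names <- c("X1", "X2", "X3", "X4")

# Each term is either in or out of a model, so enumerate all in/out patterns.
inclusion <- expand.grid(rep(list(c(FALSE, TRUE)), length(term_names)))
names(inclusion) <- term_names

# One row per candidate model, including the intercept-only model.
n_models <- nrow(inclusion)
n_models == 2^length(term_names)

# The count doubles with every added predictor.
2^(1:10)
```

With more than a handful of predictors, constraining the set (see the `fixed`, `m.lim` and `subset` arguments) is usually worth considering before running the full search.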

The fitted model objects are not stored in the result. To get (a subset of) models, use `get.models` on the object returned by `dredge`.

For a list of model types that can be used as a `global.model` see the list of supported models. Modelling functions not storing `call` in their result should be evaluated *via* the wrapper function created by `updateable`.

`rank` is found by a call to `match.fun` and may be specified as a function, a symbol, or a character string naming a function to be searched for from the environment of the call to `dredge`. The function `rank` must accept a model object as its first argument and always return a scalar.
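As a minimal sketch of such a function, the block below defines a hypothetical `bic_rank` on toy data and checks that it returns a single number. The names `dat`, `fm` and `bic_rank` are made up for illustration, and the call to `dredge` is only attempted if the MuMIn package happens to be installed.

```r
# Toy data and a global model; na.fail stops the fit if values are missing.
set.seed(1)
dat <- data.frame(x1 = rnorm(30), x2 = rnorm(30))
dat$y <- 1 + 2 * dat$x1 + rnorm(30)
fm <- lm(y ~ x1 + x2, data = dat, na.action = na.fail)

# Hypothetical rank function: model object as first argument, one number out.
# This is BIC expressed through AIC() with a log(n) penalty.
bic_rank <- function(model) AIC(model, k = log(nobs(model)))

bic_rank(fm)  # a single numeric value

# Passing it to dredge(); skipped when MuMIn is not available.
if (requireNamespace("MuMIn", quietly = TRUE)) {
  ms <- MuMIn::dredge(fm, rank = bic_rank)
  print(ms)
}
```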

By default, marginality constraints are respected, so “all possible combinations” include only those containing interactions with their respective main effects and all lower order terms. However, if `global.model` makes an exception to this principle (e.g. due to a nested design such as `a / (b + d)`), this will be reflected in the subset models.
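The effect of the marginality rule on the number of candidate models can be sketched in base R. The example assumes a hypothetical global model with terms `a`, `b` and `a:b`; it mimics the rule as described above and is not MuMIn's own code.

```r
# Hypothetical terms: two main effects and their interaction.
term_names <- c("a", "b", "a:b")

# All in/out patterns, ignoring marginality: 2^3 = 8.
grid <- expand.grid(rep(list(c(FALSE, TRUE)), length(term_names)))
names(grid) <- term_names

# Marginality: the interaction may appear only with both main effects.
keep <- !grid[["a:b"]] | (grid[["a"]] & grid[["b"]])
marginal <- grid[keep, ]

nrow(grid)      # 8 unconstrained patterns
nrow(marginal)  # 5 patterns that respect marginality
```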

There are three ways to constrain the resulting set of models: setting limits on the number of terms in a model with `m.lim`, binding term(s) to all models with `fixed`, and applying more complex rules using the argument `subset`. To be included in the selection table, the model formulation must satisfy all these conditions.

`subset` can take the form of either an *expression* or a *matrix*. The latter should be a lower triangular matrix with logical values, where columns and rows correspond to `global.model` terms. The value `subset["a", "b"] == FALSE` will exclude any model containing both terms `a` and `b`. `demo(dredge.subset)` has examples of using the `subset` matrix in conjunction with correlation matrices to exclude models containing collinear predictors.
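A matrix of this kind can be built from a correlation matrix in base R. The sketch below uses made-up predictors and a hypothetical cutoff of 0.7; only the construction of the matrix is shown, and handing it to `dredge` follows the description above.

```r
# Toy predictors; x2 is built to be strongly correlated with x1.
set.seed(42)
x1 <- rnorm(50)
preds <- data.frame(x1 = x1, x2 = x1 + rnorm(50, sd = 0.1), x3 = rnorm(50))

# TRUE where a pair of predictors may appear together (|r| <= cutoff).
cutoff <- 0.7  # hypothetical threshold
smat <- abs(cor(preds)) <= cutoff

# Keep only the lower triangle; the rest is set to NA.
smat[!lower.tri(smat)] <- NA
smat

# smat["x2", "x1"] is FALSE here, so a model containing both x1 and x2
# would be excluded when this matrix is supplied as the 'subset' argument.
```

Row and column names must match the term names of the global model for the lookup to work.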

In the form of an `expression`, the argument `subset` acts in a similar fashion to that in the function `subset` for `data.frames`: model terms can be referred to by name as variables in the expression, with the difference being that they are interpreted as logical values (i.e. equal to `TRUE` if the term exists in the model).

There is also `.(x)` and `.(+x)` notation indicating, respectively, any and all interactions including a *term* `x`. It is only useful with marginality exceptions.

The expression can contain any of the `global.model` terms (`getAllTerms(global.model)` lists them), as well as names of the `varying` argument items. Names of `global.model` terms take precedence when identical to names of `varying`, so to avoid ambiguity `varying` variables in the `subset` expression should be enclosed in `V()` (e.g. `subset = V(family) == "Gamma"` assuming that `varying` is something like `list(family = c(..., "Gamma"))`).

If item names in `varying` are missing, the items themselves are coerced to names. Call and symbol elements are represented as character values (via `deparse`), and everything except numeric, logical, character and `NULL` values is replaced by item numbers (e.g. `varying = list(family = list(..., Gamma))` should be referred to as `subset = V(family) == 2`). This can quickly become confusing, therefore it is recommended to use named lists. `demo(dredge.varying)` provides examples.

The `subset` expression can also contain the variable `` `*nvar*` `` (backtick-quoted), equal to the number of terms in the model (**not** the number of estimated parameters).

To make inclusion of a model term conditional on the presence of another model term, the function `dc` (“**d**ependency **c**hain”) can be used in the `subset` expression. `dc` takes any number of term names as arguments, and allows a term to be included only if all preceding ones are also present (e.g. `subset = dc(a, b, c)` allows for models `a`, `a+b` and `a+b+c` but not `b`, `c`, `b+c` or `a+c`).
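The dependency-chain rule can be mimicked in base R to see which term sets pass. The helper `dc_ok` below is hypothetical and written only to illustrate the logic described above; it is not MuMIn's implementation of `dc`.

```r
# Hypothetical helper: 'chain' is the ordered vector of term names,
# 'present' the terms contained in a candidate model.
dc_ok <- function(chain, present) {
  inmodel <- as.integer(chain %in% present)
  # Valid when presence never switches from absent back to present along
  # the chain, i.e. a run of 1s followed by a run of 0s.
  all(diff(inmodel) <= 0)
}

chain <- c("a", "b", "c")
candidates <- list("a", c("a", "b"), c("a", "b", "c"),
                   "b", "c", c("b", "c"), c("a", "c"))

# Label each candidate by its terms and apply the rule.
allowed <- vapply(candidates, dc_ok, logical(1), chain = chain)
names(allowed) <- vapply(candidates, paste, character(1), collapse = "+")
allowed
```

The first three candidates pass and the remaining four fail, matching the example given for `dc(a, b, c)`.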

The `subset` expression can have the form of an unevaluated `call`, an `expression` object, or a one-sided `formula`. See ‘Examples’.

Compound model terms (such as interactions, ‘as-is’ expressions within `I()` or smooths in `gam`) should be enclosed within curly brackets (e.g. `{s(x,k=2)}`), or backticks (like non-syntactic names, e.g. `` `s(x, k = 2)` ``). Backtick-quoted names must match exactly (including whitespace) the term names as given by `getAllTerms`.

`subset` expression syntax summary

`a & b`

indicates that model terms `a` and `b` must be present (see Logical Operators)

`{log(x,2)}` or `` `log(x, 2)` ``

represent a complex model term `log(x, 2)`

`V(x)`

represents a `varying` variable `x`

`.(x)`

indicates that at least one term containing the term `x` must be present

`.(+x)`

indicates that all the terms containing the term `x` must be present

`dc(a, b, c,...)`

‘dependency chain’: `b` is allowed only if `a` is present, and `c` only if both `a` and `b` are present, etc.

`` `*nvar*` ``

number of terms.

To simply keep certain terms in all models, use of the argument `fixed` is much more efficient. The `fixed` formula is interpreted in the same manner as a model formula, so the terms need not be quoted.

Use of `na.action = "na.omit"` (R's default) or `"na.exclude"` in `global.model` must be avoided, as it results in sub-models being fitted to different data sets if there are missing values. An error is thrown if this is detected.

It is a common mistake to give `na.action` as an argument in the call to `dredge` (typically resulting in an error from the `rank` function to which the argument is passed through ‘…’), while the correct way is either to pass `na.action` in the call to the global model or to set it as a global option.
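Both correct placements can be sketched in base R. The data frame `dat` is made up for illustration, and the example stops short of calling `dredge`; it only shows where `na.action` goes and what `na.fail` does when values are missing.

```r
# Toy data with one missing predictor value.
dat <- data.frame(y  = c(1.2, 2.3, 2.9, 4.1, 5.2, 5.8),
                  x1 = c(1, 2, 3, 4, 5, 6),
                  x2 = c(0.5, NA, 1.5, 1.0, 2.5, 2.0))

# Option 1: pass na.action to the modelling function itself.
# With na.fail the fit stops instead of silently dropping the incomplete row.
fit_try <- try(lm(y ~ x1 + x2, data = dat, na.action = na.fail), silent = TRUE)
inherits(fit_try, "try-error")

# Option 2: set it as a global option (restored afterwards in this sketch).
old <- options(na.action = "na.fail")
# Complete cases are chosen once, up front, so every sub-model sees the same rows.
fm <- lm(y ~ x1 + x2, data = na.omit(dat))
options(old)

nobs(fm)
```

Removing incomplete rows before fitting the global model is one way to ensure all sub-models are fitted to the same data.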

There are `subset` and `plot` methods; the latter creates a graphical representation of model weights and relative variable importance. Coefficients can be extracted with `coef` or `coefTable`.

`pdredge` is a parallelized version of this function (uses a cluster).

`get.models`, `model.avg`. `model.sel` for manual model selection tables.

Possible alternatives: `glmulti` in package glmulti and `bestglm` (bestglm). `regsubsets` in package leaps also performs all-subsets regression.

*Lasso* variable selection provided by various packages, e.g. glmnet,
lars or glmmLasso.

```
# Example from Burnham and Anderson (2002), page 100:

# prevent fitting sub-models to different datasets
options(na.action = "na.fail")

fm1 <- lm(y ~ ., data = Cement)
dd <- dredge(fm1)
subset(dd, delta < 4)

# Visualize the model selection table:
par(mar = c(3,5,6,4))
plot(dd, labAsExpr = TRUE)

# Model average models with delta AICc < 4
model.avg(dd, subset = delta < 4)

#or as a 95% confidence set:
model.avg(dd, subset = cumsum(weight) <= .95) # get averaged coefficients

#'Best' model
summary(get.models(dd, 1)[[1]])

# Examples of using 'subset':
# keep only models containing X3
dredge(fm1, subset = ~ X3) # subset as a formula
dredge(fm1, subset = expression(X3)) # subset as expression object

# the same, but more effective:
dredge(fm1, fixed = "X3")

# exclude models containing both X1 and X2 at the same time
dredge(fm1, subset = !(X1 && X2))

# Fit only models containing either X3 or X4 (but not both);
# include X3 only if X2 is present, and X2 only if X1 is present.
dredge(fm1, subset = dc(X1, X2, X3) && xor(X3, X4))

# the same as above, without "dc"
dredge(fm1, subset = (X1 | !X2) && (X2 | !X3) && xor(X3, X4))

# Include only models with up to 2 terms (and intercept)
dredge(fm1, m.lim = c(0, 2))

# Add R^2 and F-statistics, use the 'extra' argument
dredge(fm1, m.lim = c(NA, 1), extra = c("R^2", F = function(x)
    summary(x)$fstatistic[[1]]))

# with summary statistics:
dredge(fm1, m.lim = c(NA, 1), extra = list(
    "R^2", "*" = function(x) {
        s <- summary(x)
        c(Rsq = s$r.squared, adjRsq = s$adj.r.squared,
            F = s$fstatistic[[1]])
    })
)

# Add other information criterions (but rank with AICc):
dredge(fm1, m.lim = c(NA, 1), extra = alist(AIC, BIC, ICOMP, Cp))
```