Generate a model selection table of models with combinations (subsets) of fixed effect terms in the global model, with optional model inclusion rules.

```
dredge(global.model, beta = c("none", "sd", "partial.sd"), evaluate = TRUE,
rank = "AICc", fixed = NULL, m.lim = NULL, m.min, m.max, subset,
trace = FALSE, varying, extra, ct.args = NULL, deps = attr(allTerms0, "deps"),
cluster = NULL,
...)
```# S3 method for model.selection
print(x, abbrev.names = TRUE, warnings = getOption("warn") != -1L, ...)

An object of class `c("model.selection", "data.frame")`

, being a
`data.frame`

, where each row represents one model.
See `model.selection.object`

for its structure.

- global.model
a fitted ‘global’ model object. See ‘Details’ for a list of supported types.

- beta
indicates whether and how the coefficients are standardized, and must be one of

`"none"`

,`"sd"`

or`"partial.sd"`

. You can specify just the initial letter.`"none"`

corresponds to unstandardized coefficients,`"sd"`

and`"partial.sd"`

to coefficients standardized by SD and Partial SD, respectively. For backwards compatibility, logical value is also accepted,`TRUE`

is equivalent to`"sd"`

and`FALSE`

to`"none"`

. See`std.coef`

.- evaluate
whether to evaluate and rank the models. If

`FALSE`

, a list of unevaluated`call`

s is returned.- rank
optionally, the rank function returning a sort of an information criterion, to be used instead

`AICc`

, e.g.`AIC`

,`QAIC`

or`BIC`

. See ‘Details’.- fixed
optional, either a single-sided formula or a character vector giving names of terms to be included in all models. See ‘Subsetting’.

- m.lim, m.max, m.min
optionally, the limits

`c(lower, upper)`

for the number of terms in a single model (excluding the intercept). An`NA`

means no limit. See ‘Subsetting’. Specifying limits as`m.min`

and`m.max`

is allowed for backward compatibility.- subset
logical expression describing models to keep in the resulting set. See ‘Subsetting’.

- trace
if

`TRUE`

or`1`

, all calls to the fitting function are printed before actual fitting takes place. If`trace > 1`

, a progress bar is displayed.- varying
optionally, a named list describing the additional arguments to vary between the generated models. Item names correspond to the arguments, and each item provides a list of choices (i.e.

`list(arg1 = list(choice1, choice2, ...), ...)`

). Complex elements in the choice list (such as`family`

objects) should be either named (uniquely) or quoted (unevaluated, e.g. using`alist`

, see`quote`

), otherwise the result may be visually unpleasant. See example in`Beetle`

.- extra
optional additional statistics to be included in the result, provided as functions, function names or a list of such (preferably named or quoted). As with the

`rank`

argument, each function must accept as an argument a fitted model object and return (a value coercible to) a numeric vector. This could be, for instance, additional information criteria or goodness-of-fit statistics. The character strings`"R^2"`

and`"adjR^2"`

are treated in a special way and add a likelihood-ratio based R² and modified-R² to the result, respectively (this is more efficient than using`r.squaredLR`

directly).- x
a

`model.selection`

object, returned by`dredge`

.- abbrev.names
should printed term names be abbreviated? (useful with complex models).

- warnings
if

`TRUE`

, errors and warnings issued during the model fitting are printed below the table (only with`pdredge`

). To permanently remove the warnings, set the object's attribute`"warnings"`

to`NULL`

.- ct.args
optional list of arguments to be passed to

`coefTable`

(e.g.`dispersion`

parameter for`glm`

affecting standard errors used in subsequent`model averaging`

).- deps
a “dependency matrix” as returned by

`getAllTerms`

, attribute`"deps"`

. Can be used to fine-tune marginality exceptions.- cluster
if a valid

`"cluster"`

object is provided it is used for parallel execution of the fitting function. If`NULL`

or omitted single threaded execution is performed.With parallel calculation, an extra argument

`check`

is accepted.See

`pdredge`

for details and examples.- ...
optional arguments for the

`rank`

function. Any can be an unevaluated expression, in which case any`x`

within it will be substituted with the current model.

Kamil Bartoń

Models are fitted through repeated evaluation of the modified call extracted from
the `global.model`

(in a similar fashion to `update`

). This
approach, while having the advantage that it can be applied to most model types through the
usual formula interface, can have a considerable computational overhead.

Note that the number of combinations grows exponentially with the number of predictors (\(2^{N}\), less when interactions are present, see below).

The fitted model objects are not stored in the result. To get (a subset of)
the models, use `get.models`

on the object returned by `dredge`

.
Another way to get all the models is to run
`lapply(dredge(..., evaluate = FALSE), eval)`

,
which avoids fitting models twice.

For a list of model types that can be used as a `global.model`

see
the list of supported models. Modelling functions that
do not store a `call`

in their result should be evaluated *via* a wrapper function
created by `updateable`

.

`rank`

is found by a call to `match.fun`

and may be specified as a
function, a symbol, or as a character string specifying a function to be searched
for from the environment of the call to `dredge`

. It can be also a
one-element named list, where the first element is taken as the rank function.
The function `rank`

must accept a model object as its first argument and
always return a scalar.

By default, marginality constraints are respected, so “all possible
combinations” include only those containing interactions with their
respective main effects and all lower-order terms.
However, if `global.model`

makes an exception to this principle (e.g. due
to a nested design such as `a / (b + d)`

), this will be reflected in the
subset models.

There are three ways to constrain the resulting set of models: setting limits to
the number of terms in a model with `m.lim`

, binding
term(s) to all models using `fixed`

, and the `subset`

argument can be
used for more complex rules. For a model to be included in the selection table, its
formulation must satisfy all these conditions.

`subset`

may be an *expression* or a *matrix*.
The latter should be a lower triangular matrix with logical values, where the
columns and rows correspond to `global.model`

terms. Value
`subset["a", "b"] == FALSE`

will exclude any model containing both
`a` and `b` terms.

`demo(dredge.subset)`

has examples of using the `subset`

matrix in
conjunction with correlation matrices to exclude models containing collinear
predictors.

In the form of `expression`

, the argument `subset`

acts in a similar
fashion to that in the function `subset`

for `data.frames`

: model
terms can be referred to by name as variables in the expression, with the
difference being that are interpreted as logical values (i.e. equal to
`TRUE`

if the term exists in the model).

The expression can contain any of the `global.model`

term names, as well as
names of the `varying`

list items. `global.model`

term names take
precedence when identical to names of `varying`

, so to avoid ambiguity
`varying`

variables in `subset`

expression should be enclosed in
`V()`

(e.g. `V(family) == "Gamma"`

) assuming that
`varying`

is something like `list(family =`

`c("Gamma", ...))`

).

If item names in `varying`

are missing, the items themselves are coerced to
names. Call and symbol elements are represented as character values (*via*
`deparse`

), and anything other than numeric, logical, character and
`NULL`

values is replaced by item numbers (e.g. `varying =`

`list(family =`

`list(gaussian, Gamma)`

should be referred to as
`subset = V(family) == 2`

. This can quickly become confusing, so it
is recommended to use named lists. Examples can be found in `demo(dredge.varying)`

.

Term names appearing in `fixed`

and `subset`

must be given exactly
as they are returned by `getAllTerms(global.model)`

, which may differ
from the original term names (e.g. the interaction term components are ordered
alphabetically).

The `with(x)`

and `with(+x)`

notation indicates, respectively, any and
all interactions including the main effect term `x`

. This is only effective
with marginality exceptions. The extended form `with(x, order)`

allows to
specify the order of interaction of terms of which `x`

is a part. For
instance, `with(b, 2:3)`

selects models with at least one second- or
third-order interaction of variable `b`

. The second (positional)
argument is coerced to an integer vector. The “dot” notation `.(x)`

is
an alias for `with`

.

The special variable ``*nvar*``

(backtick-quoted), in the `subset`

expression is equal to the number of
terms in the model (**not** the number of estimated parameters).

To make the inclusion of a model term conditional on the presence of another one,
the function `dc`

(“**d**ependency **c**hain”) can be used in
the `subset`

expression. `dc`

takes any number of term names as
arguments, and allows a term to be included only if all preceding ones
are also present (e.g. `subset = dc(a, b, c)`

allows for models `a`

,
`a+b`

and `a+b+c`

but not `b`

, `c`

, `b+c`

or
`a+c`

).

`subset`

expression can have a form of an unevaluated `call`

,
`expression`

object, or a one-sided `formula`

. See ‘Examples’.

Compound model terms (such as interactions, ‘as-is’ expressions within
`I()`

or smooths in `gam`

) should be enclosed within curly brackets
(e.g. `{s(x,k=2)}`

), or backticks (like non-syntactic
names, e.g.
``s(x, k = 2)``

), except when they are arguments to `with`

or `dc`

.
Backticks-quoted names must match exactly (including whitespace) the term names
as returned by `getAllTerms`

.

`subset`

expression syntax summary

`a & b`

indicates that model terms

`a`and`b`must be present (see Logical Operators)`{log(x,2)}`

or```

`log(x, 2)`

```

represent a complex model term

`log(x, 2)`

`V(x)`

represents a

`varying`

item`x``with(x)`

indicates that at least one term containing the main effect term

`x`must be present`with(+x)`

indicates that all the terms containing the main effect term

`x`must be present`with(x, n:m)`

indicates that at least one term containing an

`n`-th to`m`-th order interaction term of`x`must be present`dc(a, b, c,...)`

‘dependency chain’:

`b`is allowed only if`a`is present, and`c`only if both`a`and`b`are present, etc.``*nvar*``

the number of terms in the model.

To simply keep certain terms in all models, use of argument `fixed`

is much
more efficient. The `fixed`

formula is interpreted in the same manner
as model formula and so the terms must not be quoted.

Use of `na.action = "na.omit"`

(R's default) or `"na.exclude"`

in
`global.model`

must be avoided, as it results with sub-models fitted to
different data sets if there are missing values. An error is thrown if it is
detected.

It is a common mistake to give `na.action`

as an argument in the call
to `dredge`

(typically resulting in an error from the `rank`

function to which the argument is passed through ‘...’), while the
correct way
is either to pass `na.action`

in the call to the global model or to set
it as a global option.

If present in the `global.model`

, the intercept will be included in all
sub-models.

There are `subset`

and
`plot`

methods, the latter creates a
graphical representation of model weights and per-model term sum of weights.
Coefficients can be extracted with `coef`

or `coefTable`

.

`pdredge`

is a parallelized version of this function (uses a
cluster).

`get.models`

, `model.avg`

. `model.sel`

for
manual model selection tables.

Possible alternatives: `glmulti`

in package glmulti
and `bestglm`

(bestglm).
`regsubsets`

in package leaps also performs all-subsets
regression.

Variable selection through regularization provided by various packages, e.g. glmnet, lars or glmmLasso.

```
# Example from Burnham and Anderson (2002), page 100:
# prevent fitting sub-models to different datasets
oop <-
options(na.action = "na.fail")
fm1 <- lm(y ~ ., data = Cement)
dd <- dredge(fm1)
subset(dd, delta < 4)
# Visualize the model selection table:
if(require(graphics)) {
par(mar = c(3,5,6,4))
plot(dd, labAsExpr = TRUE)
}
# Model average models with delta AICc < 4
model.avg(dd, subset = delta < 4)
#or as a 95% confidence set:
model.avg(dd, subset = cumsum(weight) <= .95) # get averaged coefficients
#'Best' model
summary(get.models(dd, 1)[[1]])
if (FALSE) {
# Examples of using 'subset':
# keep only models containing X3
dredge(fm1, subset = ~ X3) # subset as a formula
dredge(fm1, subset = expression(X3)) # subset as expression object
# the same, but more effective:
dredge(fm1, fixed = "X3")
# exclude models containing both X1 and X2 at the same time
dredge(fm1, subset = !(X1 && X2))
# Fit only models containing either X3 or X4 (but not both);
# include X3 only if X2 is present, and X2 only if X1 is present.
dredge(fm1, subset = dc(X1, X2, X3) && xor(X3, X4))
# the same as above, without "dc"
dredge(fm1, subset = (X1 | !X2) && (X2 | !X3) && xor(X3, X4))
# Include only models with up to 2 terms (and intercept)
dredge(fm1, m.lim = c(0, 2))
}
# Add R^2 and F-statistics, use the 'extra' argument
dredge(fm1, m.lim = c(NA, 1), extra = c("R^2", F = function(x)
summary(x)$fstatistic[[1]]))
# with summary statistics:
dredge(fm1, m.lim = c(NA, 1), extra = list(
"R^2", "*" = function(x) {
s <- summary(x)
c(Rsq = s$r.squared, adjRsq = s$adj.r.squared,
F = s$fstatistic[[1]])
})
)
# Add other information criteria (but rank with AICc):
dredge(fm1, m.lim = c(NA, 1), extra = alist(AIC, BIC, ICOMP, Cp))
options(oop)
```

Run the code above in your browser using DataLab