Learn R Programming

MuMIn (version 0.13.3)

model.avg: Model averaging

Description

Model averaging based on an information criterion.

Usage

model.avg(m1, ..., beta = FALSE, method = c("0", "NA"), rank = NULL,
	rank.args = NULL, alpha = 0.05)

## S3 method for class 'averaging': coef(object, ...)

## S3 method for class 'averaging': predict(object, newdata, se.fit = NULL, interval = NULL, type=NULL, ...)

Arguments

m1
A fitted model object or a list of such objects.
beta
Logical, should standardized coefficients be returned rather than normal ones?
method
If set to 0 (default), terms missing in one model are assumed to be 0's, otherwise they are omitted from the weighted average. See Details.
rank
Custom rank function (information criterion) to use instead of AICc, e.g. QAIC or BIC, may be omitted if m1 is a list returned by dredge. See Details.
rank.args
Optional list of arguments for the rank function. If one is an expression, an x within it is substituted with a current model.
alpha
Significance level for calculatinq confidence intervals.
object
An object returned by model.avg.
newdata
An optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used.
se.fit,interval
Currently not used.
type
Ignored. Only predictions on the link scale are allowed. Warning is given if user tries something else here.
...
for model.avg - more fitted model objects, for predict - arguments to be passed to respective predict method

Value

  • Ann object of class averaging, which is a list with elements:
  • summaryModel table with deviance, AICc, Delta and weight.
  • coefficientsthe model coefficients
  • variancevariance of coefficients
  • avg.modelaveraged model summary (data.frame with columns: coef - averaged coefficients, var - unconditional variance estimator, ase - adjusted standard error estimator, lci, uci - unconditional confidence intervals)
  • relative.importancerelative variable importances
  • variable.codesVariable names with numerical codes used in the summary
  • relative.importanceRelative importance of variables
  • weights
  • beta(logical) were standardized coefficients used?
  • modelthe model matrix, analogical to one that would be used in a single model.
  • residualsthe residuals (response minus fitted values).

encoding

utf-8

Details

rank is found by a call to match.fun and typically is specified as a function or a symbol (e.g. a backquoted name) or a character string specifying a function to be searched for from the environment of the call to lapply.

Function rank must be able to accept model as a first argument and must always return a scalar.

predict.averaging supports method="NA" only for linear, fixed effect models. In other cases (e.g. nonlinear or mixed models), prediction is obtained using brute force, i.e. by calling predict on each component model and weighted averaging the results, which is equivalent to assuming that missing coefficients equal zero (method="0").

Apart from predict and coef, other default methods, such as formula and residuals may be used.

References

Burnham, K. P. and Anderson, D. R (2002) Model selection and multimodel inference: a practical information-theoretic approach. 2nd ed.

See Also

dredge, get.models. QAIC has examples of using custom rank function.

Examples

Run this code
# Example from Burnham and Anderson (2002), page 100:
data(Cement)
lm1 <- lm(y ~ ., data = Cement)
dd <- dredge(lm1)
dd

#models with delta.aicc < 4
model.avg(get.models(dd, subset = delta < 4)) # get averaged coefficients

#or as a 95\% confidence set:
top.models <- get.models(dd, cumsum(weight) <= .95)

model.avg(top.models) # get averaged coefficients

#topmost model:
top.models[[1]]

# using BIC (Schwarz's Bayesian criterion) to rank the models
BIC <- function(x) AIC(x, k=log(length(residuals(x))))
mav <- model.avg(top.models, rank=BIC)


# Predicted values
nseq <- function(x, len=length(x)) seq(min(x, na.rm=TRUE), max(x, na.rm=TRUE),
	length=len)

# New predictors: X1 along the range of original data, other variables held
#	constant at their means
newdata <- as.data.frame(lapply(lapply(Cement[1:5], mean), rep, 25))
newdata$X1 <- nseq(Cement$X1, nrow(newdata))

# Predictions from each of the models in a set:
pred <- sapply(top.models, predict, newdata=newdata)
# Add predictions from the models averaged using two methods:
pred <- cbind(pred,
	averaged.0=predict(model.avg(top.models, method="0"), newdata),
	averaged.NA=predict(model.avg(top.models, method="NA"), newdata)
	)

matplot(x=newdata$X1, y=pred, type="l", lwd=c(rep(1,ncol(pred)-2), 2, 2),
	xlab="X1", ylab="y")

legend("topleft",
	legend=c(lapply(top.models, formula),
		paste("Averaged model (method=", c("0", "NA"), ")", sep="")),
	col=1:6, lty=1:5, lwd=c(rep(1,ncol(pred)-2), 2, 2), cex = .75
	)


# Example with gam models (based on "example(gam)")
require(mgcv)
dat <- gamSim(1, n = 500, dist="poisson", scale=0.1)

gam1 <- gam(y ~ s(x0) + s(x1) + s(x2) +  s(x3) + (x1+x2+x3)^2,
	family = poisson, data = dat, method = "REML")

cat(dQuote(getAllTerms(gam1)), "\n")

# include only models with smooth OR linear term (but not both) for each variable:
dd <- dredge(gam1, subset=(!`s(x1)` | !x1) & (!`s(x2)` | !x2) & (!`s(x3)` | !x3))
# ...this may take a while.

subset(dd, cumsum(weight) < .95)

top.models <- get.models(dd, cumsum(weight) <= .95)

newdata <- as.data.frame(lapply(lapply(dat, mean), rep, 50))
newdata$x1 <- nseq(dat$x1, nrow(newdata))
pred <- cbind(
	sapply(top.models, predict, newdata=newdata),
	averaged=predict(model.avg(top.models), newdata)
)

matplot(x=newdata$x1, y=pred, type="l", lwd=c(rep(1,ncol(pred)-2), 2, 2),
	xlab="x1", ylab="y")

Run the code above in your browser using DataLab