buildmer: Construct and fit as complete a model as possible and perform stepwise elimination

Description

With the default options, buildmer() will do two things:

Determine the order of the effects in your model, based on their contribution to the log-likelihood. This identifies the `maximal model', which is the model containing either all effects specified by the user, or subset of those effects that still allow the model to converge, ordered such that the most information-rich effects have made it in.
Perform backward stepwise elimination based on the change in log-likelihood.

The final model is returned in the model slot of the returned buildmer object. All functions in the buildmer package are aware of the distinction between (f)REML and ML, and know to divide chi-square p-values by 2 when comparing models differing only in random slopes (see Pinheiro & Bates 2000). The steps executed above can be changed using the direction argument, allowing for arbitrary chains of, for instance, forward-backward-forward stepwise elimination (although using more than one elimination method on the same data is not recommended). The criterion for determining the importance of terms in the ordering stage and the elimination of terms in the elimination stage can also be changed, using the crit argument.

Usage

buildmer(formula, data = NULL, family = "gaussian", cl = NULL,
  direction = c("order", "backward"), crit = "LRT", include = NULL,
  reduce.fixed = TRUE, reduce.random = TRUE, calc.anova = TRUE,
  calc.summary = TRUE, ddf = "Wald", quiet = FALSE, ...)

Arguments

formula

The model formula for the maximal model you would like to fit, if possible.

data

The data to fit the models to.

family

The error distribution to use.

An optional cluster object as returned by function makeCluster() from package parallel to use for parallelizing the evaluation of terms.

direction

Character string or vector indicating the direction for stepwise elimination; possible options are 'order' (order terms by their contribution to the model), 'backward' (backward elimination), 'forward' (forward elimination, implies order). The default is the combination c('order','backward'), to first make sure that the model converges and to then perform backward elimination; other such combinations are perfectly allowed.

crit

Character string or vector determining the criterion used to test terms for elimination. Possible options are 'LRT' (likelihood-ratio test; this is the default), 'LL' (use the raw -2 log likelihood), 'AIC' (Akaike Information Criterion), and 'BIC' (Bayesian Information Criterion).

include

A character vector of terms that will be kept in the model at all times. These do not need to be specified separately in the formula argument.

reduce.fixed

Logical indicating whether to reduce the fixed-effect structure.

reduce.random

Logical indicating whether to reduce the random-effect structure.

calc.anova

Logical indicating whether to also calculate the ANOVA table for the final model after term elimination.

calc.summary

Logical indicating whether to also calculate the summary table for the final model after term elimination.

ddf

The method used for calculating p-values if calc.anova=TRUE or calc.summary=TRUE. Options are 'Wald' (default), 'Satterthwaite' (if package lmerTest is available), 'Kenward-Roger' (if packages lmerTest and pbkrtest are available), and 'lme4' (no p-values).

quiet

Logical indicating whether to suppress progress messages.

...

Additional options to be passed to lme4 or gamm4. (They will also be passed to (g)lm in so far as they're applicable, so you can use arguments like subset=... and expect things to work. The single exception is the control argument, which is assumed to be meant only for lme4 and not for (g)lm, and will not be passed on to (g)lm.)

Examples

Run this code

# NOT RUN {
library(buildmer)
m <- buildmer(Reaction ~ Days + (Days|Subject),lme4::sleepstudy)

# }
# NOT RUN {
# Only finding the maximal model, with importance of effects measured by AIC, parallelizing the
# model evaluations using two cores, using the bobyqa optimizer and asking for verbose output
library(parallel)
cl <- makeCluster(2,outfile='')
control <- lme4::lmerControl(optimizer='bobyqa')
clusterExport(cl,'control') #this is not done automatically for '...' arguments!
m <- buildmer(f1 ~ vowel*timepoint*following + (vowel*timepoint*following|participant) +
              (timepoint|word),data=vowels,cl=cl,direction='order',crit='AIC',calc.anova=FALSE,
              calc.summary=FALSE,control=control,verbose=2)
# The maximal model is: f1 ~ vowel + timepoint + vowel:timepoint + following +
# timepoint:following +vowel:following + vowel:timepoint:following + (1 + timepoint +
# following + timepoint:following | participant) + (1 + timepoint | word)
# Now do backward stepwise elimination (result: f1 ~ vowel + timepoint + vowel:timepoint +
# following + timepoint:following + (1 + timepoint + following + timepoint:following |
# participant) + (1 + timepoint | word))
buildmer(formula(m@model),data=vowels,direction='backward',crit='AIC',control=control)
# Or forward (result: retains the full model)
buildmer(formula(m@model),data=vowels,direction='forward',crit='AIC',control=control)
# Print summary with p-values based on Satterthwaite denominator degrees of freedom
summary(m,ddf='Satterthwaite')

# Example for fitting a model without correlations in the random part
# (even for factor variables!)
# 1. Create explicit columns for factor variables
library(buildmer)
vowels <- cbind(vowels,model.matrix(~vowel,vowels))
# 2. Create formula with diagonal covariance structure
form <- diag(f1 ~ (vowel1+vowel2+vowel3+vowel4)*timepoint*following + 
	     ((vowel1+vowel2+vowel3+vowel4)*timepoint*following | participant) +
	     (timepoint | word))
# 3. Convert formula to buildmer terms list
terms <- tabulate.formula(form)
# 4. Assign the different vowelN columns to identical blocks
terms[ 2: 5,'block'] <- 'same1'
terms[ 7:10,'block'] <- 'same2'
terms[12:15,'block'] <- 'same3'
terms[17:20,'block'] <- 'same4'
terms[22:25,'block'] <- 'same5'
terms[27:30,'block'] <- 'same6'
terms[32:35,'block'] <- 'same7'
terms[37:40,'block'] <- 'same8'
# 5. Directly pass the terms object to buildmer(), using the hidden 'dep' argument to specify
# the dependent variable
m <- buildmer(terms,data=vowels,dep='f1')
# }

Run the code above in your browser using DataLab