Last chance! 50% off unlimited learning
Sale ends in
Model comparison of nested models using parametric bootstrap methods. Implemented for some commonly applied model types.
PBmodcomp(
largeModel,
smallModel,
nsim = 1000,
ref = NULL,
seed = NULL,
cl = NULL,
details = 0
)# S3 method for merMod
PBmodcomp(
largeModel,
smallModel,
nsim = 1000,
ref = NULL,
seed = NULL,
cl = NULL,
details = 0
)
# S3 method for lm
PBmodcomp(
largeModel,
smallModel,
nsim = 1000,
ref = NULL,
seed = NULL,
cl = NULL,
details = 0
)
seqPBmodcomp(largeModel, smallModel, h = 20, nsim = 1000, cl = 1)
Two models
The number of simulations to form the reference distribution.
Vector containing samples from the reference
distribution. If NULL, this vector will be generated using
PBrefdist()
.
A seed that will be passed to the simulation of new datasets.
A vector identifying a cluster; used for calculating the reference distribution using several cores. See examples below.
The amount of output produced. Mainly relevant for debugging purposes.
For sequential computing for bootstrap p-values: The number of extreme cases needed to generate before the sampling process stops.
Søren Højsgaard sorenh@math.aau.dk
The model object
must be fitted with maximum likelihood
(i.e. with REML=FALSE
). If the object is fitted with
restricted maximum likelihood (i.e. with REML=TRUE
) then
the model is refitted with REML=FALSE
before the
p-values are calculated. Put differently, the user needs not
worry about this issue.
Under the fitted hypothesis (i.e. under the fitted small model) nsim
samples of the likelihood ratio test statistic (LRT) are generated.
Then p-values are calculated as follows:
LRT: Assuming that LRT has a chi-square distribution.
PBtest: The fraction of simulated LRT-values that are larger or equal to the observed LRT value.
Bartlett: A Bartlett correction is of LRT is calculated from the mean of the simulated LRT-values
Gamma: The reference distribution of LRT is assumed to be a gamma distribution with mean and variance determined as the sample mean and sample variance of the simulated LRT-values.
F: The LRT divided by the number of degrees of freedom is assumed to be F-distributed, where the denominator degrees of freedom are determined by matching the first moment of the reference distribution.
Ulrich Halekoh, Søren Højsgaard (2014)., A Kenward-Roger Approximation and Parametric Bootstrap Methods for Tests in Linear Mixed Models - The R Package pbkrtest., Journal of Statistical Software, 58(10), 1-30., https://www.jstatsoft.org/v59/i09/
KRmodcomp
, PBrefdist
if (FALSE) {
(fm0 <- lmer(Reaction ~ (Days|Subject), sleepstudy))
(fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy))
(fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days|Subject), sleepstudy))
NSIM <- 50 ## Simulations in parametric bootstrap; default is 1000.
## Test for no effect of Days in fm1, i.e. test fm0 under fm1
PBmodcomp(fm1, "Days", cl=1, nsim=NSIM)
PBmodcomp(fm1, ~.-Days, cl=1, nsim=NSIM)
L1 <- cbind(0, 1)
PBmodcomp(fm1, L1, cl=1, nsim=NSIM) ## FIXME
PBmodcomp(fm1, fm0, cl=1, nsim=NSIM)
anova(fm1, fm0)
## Test for no effect of Days and Days-squared in fm2, i.e. test fm0 under fm2
PBmodcomp(fm2, "(Days+I(Days^2))", cl=1, nsim=NSIM)
PBmodcomp(fm2, ~. - Days - I(Days^2), cl=1, nsim=NSIM)
L2 <- rbind(c(0, 1, 0), c(0, 0, 1))
PBmodcomp(fm2, L2, cl=1, nsim=NSIM) ## FIXME
PBmodcomp(fm2, fm0, cl=1, nsim=NSIM)
anova(fm2, fm0)
## Test for no effect of Days-squared in fm2, i.e. test fm1 under fm2
PBmodcomp(fm2, "I(Days^2)", cl=1, nsim=NSIM)
PBmodcomp(fm2, ~. - I(Days^2), cl=1, nsim=NSIM)
L3 <- rbind(c(0, 0, 1))
## PBmodcomp(fm2, L3, cl=1, nsim=NSIM) ## FIXME
PBmodcomp(fm2, fm1, cl=1, nsim=NSIM)
anova(fm2, fm1)
## Linear normal model:
sug <- lm(sugpct ~ block + sow + harvest, data=beets)
sug.h <- update(sug, .~. -harvest)
sug.s <- update(sug, .~. -sow)
PBmodcomp(sug, "harvest", nsim=NSIM, cl=1)
PBmodcomp(sug, ~. - harvest, nsim=NSIM, cl=1)
PBmodcomp(sug, sug.h, nsim=NSIM, cl=1)
anova(sug, sug.h)
## Generalized linear model
mm <- glm(ndead/ntotal ~ sex + log(dose), family=binomial, weight=ntotal, data=budworm)
mm0 <- update(mm, .~. -sex)
### Test for no effect of sex
PBmodcomp(mm, "sex", cl=1, nsim=NSIM)
PBmodcomp(mm, ~.-sex, cl=1, nsim=NSIM)
## PBmodcomp(mm, cbind(0, 1, 0), nsim=NSIM): FIXME
PBmodcomp(mm, mm0, cl=1, nsim=NSIM)
anova(mm, mm0, test="Chisq")
}
## Generalized linear mixed model (it takes a while to fit these)
if (FALSE) {
(gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
data = cbpp, family = binomial))
(gm2 <- update(gm1, .~.-period))
PBmodcomp(gm1, "period", nsim=NSIM)
PBmodcomp(gm1, ~. -period, nsim=NSIM)
PBmodcomp(gm1, gm2, nsim=NSIM)
anova(gm1, gm2)
}
if (FALSE) {
## Linear mixed effects model:
sug <- lmer(sugpct ~ block + sow + harvest + (1|block:harvest),
data=beets, REML=FALSE)
sug.h <- update(sug, .~. -harvest)
sug.s <- update(sug, .~. -sow)
anova(sug, sug.h)
PBmodcomp(sug, sug.h, nsim=NSIM, cl=1)
PBmodcomp(sug, "harvest", nsim=NSIM, cl=1)
anova(sug, sug.s)
PBmodcomp(sug, sug.s, nsim=NSIM, cl=1)
PBmodcomp(sug, "sow", nsim=NSIM, cl=1)
## Simulate reference distribution separately:
refdist <- PBrefdist(sug, sug.h, nsim=1000, cl=1)
refdist <- PBrefdist(sug, "harvest", nsim=1000, cl=1)
refdist <- PBrefdist(sug, ~.-harvest, nsim=1000, cl=1)
## Do computations with multiple processors:
## Number of cores:
(nc <- detectCores())
## Create clusters
cl <- makeCluster(rep("localhost", nc))
## Then do:
refdist <- PBrefdist(sug, sug.h, nsim=1000, cl=cl)
## It is recommended to stop the clusters before quitting R:
stopCluster(cl)
}
lm1 <- lm(dist~speed+I(speed^2), data=cars)
PBmodcomp(lm1, .~.-speed, cl=2)
PBmodcomp(lm1, .~.-I(speed^2), cl=2)
Run the code above in your browser using DataLab