ANOVA.boot: Residual and wild bootstrap in 1-way and 2-way ANOVA

Description

This function performs the residual bootstrap as described by Efron (1979) and the wild bootstrap as described by Wu (1986) for ANOVA hypothesis testing. Linear models with a quantitative response and categorical and/or quantitative predictor variables are allowed. The function returns the bootstrap null distribution for each term to be tested. Estimation is performed via least squares, and only Type I sums of squares are calculated.
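As a rough illustration of the residual bootstrap idea, the following base R sketch builds a bootstrap null distribution of the F statistic for a 1-way model. It assumes one common formulation (residuals from the full model are resampled and added to the grand-mean fit); the names dat, null_fit, and bootF are illustrative, and the internals of ANOVA.boot may differ.

```r
# Minimal residual-bootstrap sketch for a 1-way ANOVA (base R only).
# Assumption: full-model residuals are resampled with replacement and added
# to the null-model fit (the grand mean). ANOVA.boot itself may differ.
set.seed(1)
dat <- data.frame(y = mtcars$mpg, g = factor(mtcars$cyl))

full  <- lm(y ~ g, data = dat)
origF <- anova(full)[["F value"]][1]   # observed F statistic for g

B        <- 500
res      <- residuals(full)
null_fit <- mean(dat$y)                # fitted value under H0: no group effect

bootF <- replicate(B, {
  # bootstrap response = null fit + resampled residuals
  y_star <- null_fit + sample(res, replace = TRUE)
  anova(lm(y_star ~ dat$g))[["F value"]][1]
})

summary(bootF)   # bootstrap null distribution of F
```

Comparing origF with the spread of bootF gives an informal sense of how unusual the observed statistic is under the null hypothesis.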

Usage

ANOVA.boot(formula, B = 1000, type = "residual", wild.dist = "normal", 
            seed = NULL, data = NULL, keep.boot.resp = FALSE)

Arguments

formula

a linear model formula of the form response ~ predictors, as you would supply to the lm() function. All variables must contain non-missing entries.

B

number of bootstrap samples. This should be a large, positive integer value.

type

type of bootstrap to perform. Select either "residual" for residual bootstrap or "wild" for wild bootstrap.

wild.dist

distribution used to create the wild bootstrap weights for the residuals. Allowed distributions include "normal", "uniform", "exponential", "laplace", "lognormal", "gumbel", "t5", "t8", and "t14". The number after each t-distribution indicates its degrees of freedom. Whichever distribution is selected, the weights are drawn from it so that they have mean 0 and variance 1.

seed

optionally, set a value for the seed for the bootstrap sample generation. The default NULL will pick a random value for the seed.

data

optionally, input the name of the dataset where variables appearing in the model are stored.

keep.boot.resp

a logical value indicating whether the returned list includes the raw bootstrap responses. Setting this to TRUE may not be feasible for large datasets or a large number of bootstrap samples because of memory usage.
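To make the wild.dist requirement of mean 0 and variance 1 concrete, here is a small base R sketch of standardized weights for a few of the listed distributions. The helper wild_weights() is hypothetical, and these particular standardizations are assumptions rather than the package's own code.

```r
# Hypothetical helper: draws wild-bootstrap weights with mean 0, variance 1.
# The constructions below are illustrative assumptions, not ANOVA.boot code.
wild_weights <- function(n, dist = c("normal", "uniform", "exponential", "t5")) {
  dist <- match.arg(dist)
  switch(dist,
    normal      = rnorm(n),
    uniform     = runif(n, -sqrt(3), sqrt(3)),  # Var = (2*sqrt(3))^2 / 12 = 1
    exponential = rexp(n) - 1,                  # Exp(1) has mean 1, variance 1
    t5          = rt(n, df = 5) / sqrt(5 / 3)   # Var of t_5 is 5 / (5 - 2)
  )
}

set.seed(42)
w <- wild_weights(1e5, "uniform")
c(mean = mean(w), var = var(w))   # approximately 0 and 1

# One wild-bootstrap response: null fit plus residuals times random weights
fit    <- lm(mpg ~ factor(cyl), data = mtcars)
y_star <- mean(mtcars$mpg) + residuals(fit) * wild_weights(nrow(mtcars))
```

Unlike the residual bootstrap, each residual stays attached to its own observation and is only rescaled by a random weight, which is what lets the wild bootstrap accommodate unequal error variances.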

Value

terms

names of the terms/rows of the ANOVA table. These correspond to each predictor variable input to the formula.

degrees of freedom associated with each term/row in the ANOVA table. For a categorical predictor, this is the number of categories minus 1; for a quantitative predictor, it is 1.

origFStats

original F-statistic value. Same value as obtained by aov() using type I sum of squares.

origSSE

original sum of squares, error. Same value as obtained by aov() using type I sum of squares.

origSSTr

original sum of squares, treatment. Vector containing the sum of squares for each term in the ANOVA model. These are the same values as obtained by aov() using type I sum of squares.

bootFStats

matrix containing the bootstrap F statistics. Each column corresponds to a term in the ANOVA table. There are B rows.

bootSSE

matrix containing the bootstrap sum of squares, error. Each column corresponds to a term in the ANOVA table. There are B rows. These are calculated using type I sum of squares.

bootSSTr

matrix containing the bootstrap sum of squares, treatment. Each column corresponds to a term in the ANOVA table. There are B rows. These are calculated using type I sum of squares.

`p-values`

vector containing the bootstrap p-values for each predictor term in the ANOVA model. These are calculated by counting the number of bootstrap test statistics that are greater than the original observed test statistic and dividing by B.
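The p-value rule described above can be sketched in a few lines of base R. The numbers below are made up purely for illustration (5 bootstrap statistics for 2 hypothetical terms), and the object names are not taken from the package.

```r
# Toy values (invented for illustration): 5 bootstrap F statistics, 2 terms.
bootF <- matrix(c(0.5, 1.2, 3.4, 0.9, 2.1,
                  4.0, 0.3, 6.5, 1.1, 0.8),
                ncol = 2, dimnames = list(NULL, c("termA", "termB")))
origF <- c(termA = 2.0, termB = 5.0)

# Per term: share of bootstrap statistics exceeding the observed statistic
pvals <- colMeans(sweep(bootF, 2, origF, FUN = ">"))
pvals   # termA: 2/5 = 0.4, termB: 1/5 = 0.2
```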

Details

Currently, the user must manipulate the output of the function manually to view the bootstrap ANOVA table components and visualize the null distribution. More convenient/streamlined output is expected in future package versions.
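One way to do this manual step is sketched below. The helper plot_boot_null() is hypothetical (not part of the package); it relies only on the terms, origFStats, and bootFStats components listed under Value. The mock object holds simulated values so the sketch runs without the package installed.

```r
# Hypothetical helper: histogram of the bootstrap null distribution of F for
# one term, with the observed statistic marked by a dashed line.
plot_boot_null <- function(res, term = 1) {
  bootF <- res$bootFStats[, term]
  hist(bootF, breaks = 30, xlab = "F statistic",
       main = paste("Bootstrap null:", res$terms[term]))
  abline(v = res$origFStats[term], lty = 2)        # observed F
  invisible(mean(bootF > res$origFStats[term]))    # bootstrap p-value
}

# Stand-in for ANOVA.boot output: same component names, simulated values.
set.seed(7)
mock <- list(terms      = "as.factor(cyl)",
             origFStats = 3.2,
             bootFStats = matrix(rf(1000, 2, 29), ncol = 1))
p <- plot_boot_null(mock)
p
```

With a real result, the same call would be plot_boot_null(myANOVA1), assuming the returned components have the shapes described under Value.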

Thanks to Bochuan Lyu, who helped code this function.

References

Efron, B. (1979). "Bootstrap Methods: Another Look at the Jackknife." Annals of Statistics. Vol. 7, No. 1, pp. 1-26.

Wu, C.F.J. (1986). "Jackknife, Bootstrap, and Other Resampling Methods in Regression Analysis." Annals of Statistics. Vol. 14, No. 4, pp. 1261-1295.

Examples

# NOT RUN {
data(mtcars)         #load an example dataset
myANOVA2 <- ANOVA.boot(mpg~as.factor(cyl)*as.factor(am), data=mtcars)
myANOVA2$`p-values`  #bootstrap p-values for 2-way interactions model

myANOVA1 <- ANOVA.boot(mpg~as.factor(cyl), data=mtcars)
myANOVA1$`p-values` #bootstrap p-values for 1-way model

myANOVA2a <- ANOVA.boot(mpg~as.factor(cyl)+as.factor(am), data=mtcars)
myANOVA2a$`p-values` #bootstrap p-values for 2-way additive model

# }