reverseR (version 0.2)

bootLM: Nonparametric/Parametric bootstrap linear model

Description

Nonparametric and parametric bootstrap methods (resampling cases or residuals with replacement, or drawing from a normal distribution) for estimating the parameters of a linear model and their confidence intervals.

Usage

bootLM(model, type = c("cases", "residuals", "residuals2", "parametric"), 
       R = 10000, alpha = 0.05, ret.models = FALSE)

Value

A data frame containing the estimated coefficients, their standard errors, lower and upper confidence values and p-values. If ret.models = TRUE, a list with all R models is returned.

Arguments

model

an lm model.

type

what to bootstrap. See "Details".

R

number of bootstrap samples.

alpha

the \(\alpha\)-level used as the threshold for the confidence interval.

ret.models

logical. If TRUE, the R models are returned as a list.

Author

Andrej-Nikolai Spiess

Details

If type = "cases", linear models are created from the \(N\) datapoints (\(x_i, y_i\)) by sampling indices \(n \in \{1 \ldots N\}\) R times with replacement and building models \(Y_n = X_n\beta + \varepsilon\). This is sometimes called the .632-bootstrap, because each sample will, on average, contain a fraction of \(1 - e^{-1} \approx 0.632\) of the original datapoints as unique elements.

If type = "residuals", the residuals (\(r_i = y_i - \hat{y}_i\)) are sampled R times with replacement, with \(n \in \{1 \ldots N\}\), and models \(\hat{Y}_i + r_n = X_i\beta + \varepsilon\) are built.

If type = "residuals2" is selected, scaled and centered residuals \(r_n = \frac{r_i}{\sqrt{1 - h_{ii}}} - \bar{r}\) according to Davison & Hinkley are used, where \(h_{ii}\) is the leverage of datapoint \(i\).

In the "parametric" bootstrap, \(N\) values \(j_n\) drawn from a normal distribution, \(j_n \sim \mathcal{N}(0, \sigma)\) with \(\sigma = \sqrt{\frac{\sum r_i^2}{N - p}}\) and \(p\) the number of coefficients, are added to the fitted values, and linear models \(\hat{Y}_i + j_n = X_i\beta + \varepsilon\) are created.

Parameter estimates are obtained from each bootstrap sample. From these, the average \(\overline{P_{n}}\), the standard error \(\hat{\sigma}\) and a quantile-based confidence interval are calculated. p-values are calculated through inversion of the confidence interval.
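The resampling schemes described above can be illustrated with a minimal base-R sketch. This is not the reverseR implementation: the function name boot_coefs() and the names of its output columns are hypothetical, only the "cases", "residuals" and "parametric" schemes are covered, and the p-value formula is one common way of inverting a percentile interval, assumed here for illustration.

```r
## Hypothetical helper, NOT part of reverseR: a hand-rolled bootstrap
## of lm coefficients using only base R / stats.
boot_coefs <- function(model, type = c("cases", "residuals", "parametric"),
                       R = 1000, alpha = 0.05) {
  type  <- match.arg(type)
  X     <- model.matrix(model)                  # design matrix
  y     <- model.response(model.frame(model))   # observed response
  n     <- nrow(X)
  yhat  <- fitted(model)
  res   <- residuals(model)
  sigma <- sqrt(sum(res^2) / df.residual(model))  # sqrt(sum(r_i^2) / (N - p))

  est <- matrix(NA_real_, nrow = R, ncol = ncol(X),
                dimnames = list(NULL, colnames(X)))
  for (i in seq_len(R)) {
    if (type == "cases") {
      ## resample (x_i, y_i) pairs with replacement
      idx <- sample.int(n, n, replace = TRUE)
      est[i, ] <- lm.fit(X[idx, , drop = FALSE], y[idx])$coefficients
    } else {
      ## keep X fixed, perturb the fitted values
      noise <- if (type == "residuals") {
        sample(res, n, replace = TRUE)   # resampled residuals
      } else {
        rnorm(n, 0, sigma)               # draws from N(0, sigma)
      }
      est[i, ] <- lm.fit(X, yhat + noise)$coefficients
    }
  }

  ## Assumed inversion of the percentile interval: the smallest alpha
  ## at which the interval no longer contains zero.
  p <- pmin(1, 2 * pmin(colMeans(est <= 0), colMeans(est >= 0)))

  data.frame(estimate = colMeans(est),
             se       = apply(est, 2, sd),
             lower    = apply(est, 2, quantile, probs = alpha / 2),
             upper    = apply(est, 2, quantile, probs = 1 - alpha / 2),
             p.value  = p)
}

## Usage on simulated data
set.seed(123)
a <- 1:20
b <- 5 + 0.08 * a + rnorm(20, 0, 1)
fit <- lm(b ~ a)
boot_coefs(fit, type = "cases", R = 200)
```

Refitting with lm.fit() on the design matrix keeps the sketch short and avoids re-evaluating the model formula for every bootstrap sample.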

References

An Introduction to the Bootstrap.
Efron B, Tibshirani R.
Chapman & Hall (1993).

The Bootstrap and Edgeworth Expansion.
Hall P.
Springer, New York (1992).

Modern Statistics with R.
Thulin M.
Eos Chasma Press, Uppsala (2021).

Bootstrap methods and their application.
Davison AC, Hinkley DV.
Cambridge University Press (1997).

Examples

## Example with single influencer (#18) and insignificant model (p = 0.115),
## using case bootstrap.
set.seed(123)
a <- 1:20
b <- 5 + 0.08 * a + rnorm(20, 0, 1)
LM <- lm(b ~ a)
bootLM(LM, R = 100)

## using residuals bootstrap.
bootLM(LM, R = 100, type = "residuals")
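The remaining values of type listed under "Usage" can be tried in the same way. The following sketch assumes the reverseR package is installed and uses only the arguments documented above. The data are regenerated so that the block runs on its own, and no output is shown because results depend on the random draws.

```r
## Same simulated data as in the examples above.
set.seed(123)
a <- 1:20
b <- 5 + 0.08 * a + rnorm(20, 0, 1)
LM <- lm(b ~ a)

## Only run if reverseR is available (assumption: installed from CRAN).
if (requireNamespace("reverseR", quietly = TRUE)) {
  ## scaled and centered residuals
  print(reverseR::bootLM(LM, R = 100, type = "residuals2"))

  ## parametric bootstrap with a 90% confidence interval
  print(reverseR::bootLM(LM, R = 100, type = "parametric", alpha = 0.1))
}
```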
