regionInfl: Identify regions of significance reversal and influence measure threshold

Description

Identifies regions of a (univariate) linear model in which a future data point would result in either
a) significance reversal, or
b) any selected influence measure, as given in crit, exceeding its threshold value.
This is intended mainly for visualization and teaching purposes.

Usage

regionInfl(model, div.x = 20, div.y = 20, grid = TRUE, pred.int = TRUE,
       crit = c("P", "dfb.Slope", "dffit", "cov.r", "cook.d", "hat", "hadi",
       "sR", "cdr", "Si"), cex.grid = 0.5, alpha = 0.05, xlim = NULL, ylim = NULL, ...)

Value

A plot with the regions marked in green or orange, and the grid matrix (grid) holding the criterion outcome for each grid point, coded as 1 (green) or 0 (orange).

Arguments

model: the linear model of class lm.
div.x: the number of grid divisions for the x-axis.
div.y: the number of grid divisions for the y-axis.
grid: logical. Show the grid lines on the plot or not.
pred.int: logical. Show the 95% prediction interval on the plot or not.
crit: the criterion to use. Either "P" for significance reversal or any of the influence measures listed under Usage.
cex.grid: size of the grid points.
alpha: the \(\alpha\)-level to be set as threshold.
xlim: as xlim in plot, a 2-element vector for the x-axis limits, overrides fac.x.
ylim: as ylim in plot, a 2-element vector for the y-axis limits, overrides fac.y.
...: other parameters to be supplied to plot or lmInfl.

Author

Andrej-Nikolai Spiess

Details

For a given linear model \(y_i = \beta_0 + \beta_1 x_i + \varepsilon\), each \((a, b)\) pair from a grid of values \((a_1 \ldots a_j, b_1 \ldots b_k)\) is added to the data, and an updated model \((y_i, b_k) = \beta_0 + \beta_1 (x_i, a_j) + \varepsilon\) is created. A grid point is plotted in green if the updated model's \(p \leq \alpha\) (for crit = "P") or if the selected influence measure does not exceed its published threshold; otherwise it is plotted in orange. If outlier = TRUE, a possible reverser is eliminated prior to analysis but visualized in the plot.
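
The grid procedure for crit = "P" can be sketched in base R as follows. This is only an illustration of the idea described above, not the package's actual implementation: the helper slope_p and the column names a, b, p and significant are hypothetical, and the grid limits are simply taken from the example call further down.

```r
## Illustrative sketch only; NOT the regionInfl source code.
set.seed(7)
N <- 20
x <- runif(N, 1, 100)
y <- 0.05 * x + rnorm(N, 0, 2)
alpha <- 0.05

## Hypothetical helper: slope p-value after appending one point (a, b).
slope_p <- function(a, b, x, y) {
  fit <- lm(y ~ x, data = data.frame(x = c(x, a), y = c(y, b)))
  summary(fit)$coefficients["x", "Pr(>|t|)"]
}

## Grid of candidate future points (20 x 20, limits assumed for illustration).
a_seq <- seq(-20, 120, length.out = 20)
b_seq <- seq(-5, 10, length.out = 20)
grid <- expand.grid(a = a_seq, b = b_seq)

## Refit the model once per grid point and record the outcome as 1 or 0.
grid$p <- mapply(slope_p, grid$a, grid$b, MoreArgs = list(x = x, y = y))
grid$significant <- as.integer(grid$p <= alpha)

## Green where the updated model has p <= alpha, orange otherwise.
plot(x, y, xlim = range(a_seq), ylim = range(b_seq), pch = 16)
points(grid$a, grid$b, pch = 15, cex = 0.5,
       col = ifelse(grid$significant == 1, "green", "orange"))
```

Each grid point costs one model refit, so the run time grows with div.x * div.y; for the influence-measure criteria, the same loop would compare the chosen measure of the appended point against its threshold instead of comparing p with alpha.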

Examples


## Model with p = 0.014 
set.seed(7)
N <- 20
x <- runif(N, 1, 100)
y <- 0.05 * x + rnorm(N, 0, 2)
LM1 <- lm(y ~ x)
summary(LM1)
regionInfl(LM1, crit = "P", div.x = 20, div.y = 20, cex.grid = 1, 
           xlim = c(-20, 120), ylim = c(-5, 10))
