reverseR (version 0.2)

regionInfl: Identify regions of significance reversal and influence measure threshold

Description

Identifies regions of a (univariate) linear model in which a future data point would result in either
a) significance reversal, or
b) any selected influence measure, as given in crit, exceeding its threshold value.
This is intended mainly for visual/didactical purposes.

Usage

regionInfl(model, div.x = 20, div.y = 20, grid = TRUE, pred.int = TRUE,
       crit = c("P", "dfb.Slope", "dffit", "cov.r", "cook.d", "hat", "hadi",
       "sR", "cdr", "Si"), cex.grid = 0.5, alpha = 0.05, xlim = NULL, ylim = NULL, ...)

Value

A plot with the regions marked in orange or green, and the grid matrix (grid) with the criterion outcome coded as 1 (green) or 0 (orange).

Arguments

model

the linear model of class lm.

div.x

the number of grid divisions for the x-axis.

div.y

the number of grid divisions for the y-axis.

grid

logical. Show the grid lines on the plot or not.

pred.int

logical. Show the 95% prediction interval on the plot or not.

crit

the criterion to use. Either "P" for significance reversal or any one of the influence measures listed under Usage.

cex.grid

size of the grid points.

alpha

the \(\alpha\)-level to be set as threshold.

xlim

similar to xlim, a 2-element vector for the x-axis limits, overrides fac.x.

ylim

similar to ylim, a 2-element vector for the y-axis limits, overrides fac.y.

...

other parameters to be supplied to plot or lmInfl.

Author

Andrej-Nikolai Spiess

Details

For a given linear model \(y_i = \beta_0 + \beta_1 x_i + \varepsilon\), each \((a_j, b_k)\) pair from a grid of values \((a_1 \ldots a_j, b_1 \ldots b_k)\) is added to the data in turn, and an updated model \((y_i, b_k) = \beta_0 + \beta_1 (x_i, a_j) + \varepsilon\) is fitted. If the updated model's \(p \leq \alpha\) (for crit = "P"), or the selected influence measure does not exceed its published threshold, the grid point is plotted in green, otherwise in orange. If outlier = TRUE, a possible reverser is eliminated prior to analysis but visualized in the plot.
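
The grid procedure for crit = "P" can be illustrated with base R alone. This is only a sketch under stated assumptions, not the code regionInfl actually runs: the helper refit_p and the names gx, gy and keep are hypothetical, and taking 20 equally spaced grid values per axis is an assumption about what div.x and div.y mean. Each grid point is appended to the data, the model is re-fitted, and the slope's p-value is compared with alpha.

```r
## Sketch of the grid re-fit for crit = "P", base R only.
## NOT the reverseR implementation; refit_p, gx, gy, keep are hypothetical names.
set.seed(7)
N <- 20
x <- runif(N, 1, 100)
y <- 0.05 * x + rnorm(N, 0, 2)

## p-value of the slope after adding one future point (a, b) to the data
refit_p <- function(a, b, x, y) {
  fit <- lm(y ~ x, data = data.frame(x = c(x, a), y = c(y, b)))
  summary(fit)$coefficients["x", "Pr(>|t|)"]
}

alpha <- 0.05
gx <- seq(-20, 120, length.out = 20)  # candidate x values (a); spacing is assumed
gy <- seq(-5, 10, length.out = 20)    # candidate y values (b); spacing is assumed
grid <- expand.grid(a = gx, b = gy)

## Re-fit once per grid point
grid$p <- mapply(refit_p, grid$a, grid$b, MoreArgs = list(x = x, y = y))
grid$keep <- as.integer(grid$p <= alpha)  # 1 = p <= alpha (green), 0 = otherwise (orange)

## Original data, fitted line, and the colored grid
plot(x, y, xlim = range(gx), ylim = range(gy), pch = 16)
abline(lm(y ~ x))
points(grid$a, grid$b, cex = 0.5,
       col = ifelse(grid$keep == 1, "green", "orange"))
```

The sketch covers the p-value criterion only. The influence-measure criteria would need the corresponding statistic and its threshold for the added point, which this sketch does not attempt.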

Examples

## Model with p = 0.014 
set.seed(7)
N <- 20
x <- runif(N, 1, 100)
y <- 0.05 * x + rnorm(N, 0, 2)
LM1 <- lm(y ~ x)
summary(LM1)
regionInfl(LM1, crit = "P", div.x = 20, div.y = 20, cex.grid = 1, 
           xlim = c(-20, 120), ylim = c(-5, 10))