stat_fit_residuals fits a linear model and returns residuals ready to be plotted as points.
stat_fit_residuals(
  mapping = NULL,
  data = NULL,
  geom = "point",
  method = "lm",
  method.args = list(),
  formula = NULL,
  resid.type = NULL,
  position = "identity",
  na.rm = FALSE,
  orientation = NA,
  show.legend = FALSE,
  inherit.aes = TRUE,
  ...
)
data: A layer-specific dataset, only needed if you want to override the plot defaults.
geom: The geometric object to use to display the data.
method: function or character. If character, "lm", "rlm", and "rq" are implemented. If a function, it must support parameters formula and data.
method.args: named list with additional arguments.
formula: a "formula" object, written using aesthetic names instead of the original variable names.
resid.type: character passed to residuals() as the argument for type.
position: The position adjustment to use for overlapping points on this layer.
na.rm: a logical indicating whether NA values should be stripped before the computation proceeds.
orientation: character. Either "x" or "y", controlling the default for formula.
show.legend: logical. Should this layer be included in the legends? NA includes the layer if any aesthetics are mapped, FALSE (the default in the usage above) never includes it, and TRUE always includes it.
inherit.aes: If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and should not inherit behaviour from the default plot specification, e.g. borders().
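To illustrate what the type argument of residuals() selects, the sketch below calls residuals() directly on a weighted lm() fit in base R. The data and weights are made up for illustration, and the comparison reflects how stats::residuals() behaves for "lm" objects; it is not code taken from stat_fit_residuals() itself.

```r
# Made-up data with unequal weights (illustrative only)
set.seed(1)
d <- data.frame(x = 1:20)
d$y <- 2 + 0.5 * d$x + rnorm(20)
w <- rep(c(1, 4), times = 10)

fit <- lm(y ~ x, data = d, weights = w)

# "response" residuals are observed minus fitted values
res.response <- residuals(fit, type = "response")
# for a weighted lm fit, "pearson" residuals are scaled by sqrt(weights)
res.pearson <- residuals(fit, type = "pearson")

all.equal(unname(res.pearson), unname(res.response) * sqrt(w))
```

Which values of type are accepted depends on the class of the fitted model, so the choices available for resid.type can be expected to differ between "lm", "rlm" and "rq".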
A data frame with the same number of rows as the subset of data for each group, containing five numeric variables. Those described here are the x coordinates of observations, the residuals from fitted values, and the absolute residuals from the fit.
By default stat(y.resid) is mapped to the y aesthetic.
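The per-group shape of the returned data can be pictured with a base R sketch. It assumes made-up grouped data and a plain lm() fit per group; the column name abs.resid is hypothetical and chosen only for this illustration, while y.resid follows the name used above.

```r
# Made-up grouped data (illustrative only)
set.seed(42)
d <- data.frame(x = rep(1:10, times = 2),
                group = rep(c("a", "b"), each = 10))
d$y <- ifelse(d$group == "a", 1, 3) * d$x + rnorm(20)

# Fit one model per group and keep one row per observation
resid_by_group <- lapply(split(d, d$group), function(g) {
  fit <- lm(y ~ x, data = g)
  data.frame(x = g$x,
             y.resid = unname(residuals(fit)),
             abs.resid = abs(unname(residuals(fit))))  # hypothetical column name
})

sapply(resid_by_group, nrow)  # one row per observation in each group
```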
This stat can be used to automatically plot residuals as points in a plot. It fits the model selected with method ("lm" by default, with "rlm" and "rq" also implemented) and only generates the residuals.
A ggplot statistic receives as data a data frame that is not the one passed as argument by the user, but instead a data frame with the variables mapped to aesthetics. In other words, it respects the grammar of graphics, and consequently the names of aesthetics such as x and y should be used within the model formula instead of the original variable names, while the data frame is passed automatically as data. This helps ensure that the model is fitted to the same data as plotted in other layers.
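The following base R sketch mimics this renaming outside of ggplot2. The data frame, its column names (dose, response) and the helper fit_residuals() are all hypothetical and exist only for illustration; the helper merely shows how a fit function that accepts formula and data could be called with extra arguments, and is not the package's own implementation.

```r
# Made-up data whose columns are NOT called x and y (illustrative only)
set.seed(7)
user.data <- data.frame(dose = 1:30)
user.data$response <- 5 + 2 * user.data$dose + rnorm(30, sd = 3)

# Mimic aes(x = dose, y = response): the stat sees columns named x and y
layer.data <- data.frame(x = user.data$dose, y = user.data$response)

# Hypothetical helper: call a fit function that accepts formula and data,
# forwarding any extra arguments supplied as a named list
fit_residuals <- function(method, formula, data, method.args = list()) {
  fit <- do.call(method, c(list(formula = formula, data = data), method.args))
  residuals(fit)
}

# The formula refers to x and y, not to dose and response
res <- fit_residuals(lm, y ~ x, layer.data)
length(res) == nrow(layer.data)
```

Writing the formula as response ~ dose would fail here, because layer.data no longer contains columns with those names.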
Other ggplot statistics for model fits: stat_fit_augment(), stat_fit_deviations(), stat_fit_glance(), stat_fit_tb(), stat_fit_tidy()
# packages needed to run the examples
library(ggplot2)
library(ggpmisc)

# generate artificial data
set.seed(4321)
x <- 1:100
y <- (x + x^2 + x^3) + rnorm(length(x), mean = 0, sd = mean(x^3) / 4)
my.data <- data.frame(x, y)

# plot residuals from linear model
ggplot(my.data, aes(x, y)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  stat_fit_residuals(formula = y ~ x)

# plot residuals from linear model with y as explanatory variable
ggplot(my.data, aes(x, y)) +
  geom_vline(xintercept = 0, linetype = "dashed") +
  stat_fit_residuals(formula = x ~ y) +
  coord_flip()

# give a name to a formula
my.formula <- y ~ poly(x, 3, raw = TRUE)

# plot residuals from linear model
ggplot(my.data, aes(x, y)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  stat_fit_residuals(formula = my.formula) +
  coord_flip()

ggplot(my.data, aes(x, y)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  stat_fit_residuals(formula = my.formula, resid.type = "response")

# plot residuals from robust regression
ggplot(my.data, aes(x, y)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  stat_fit_residuals(formula = my.formula, method = "rlm")

# plot residuals with weights indicated by colour
my.data.outlier <- my.data
my.data.outlier[6, "y"] <- my.data.outlier[6, "y"] * 10
ggplot(my.data.outlier, aes(x, y)) +
  stat_fit_residuals(formula = my.formula, method = "rlm",
                     mapping = aes(colour = after_stat(weights)),
                     show.legend = TRUE) +
  scale_color_gradient(low = "red", high = "blue", limits = c(0, 1),
                       guide = "colourbar")

# plot weighted residuals with weights indicated by colour
ggplot(my.data.outlier) +
  stat_fit_residuals(formula = my.formula, method = "rlm",
                     mapping = aes(x = x,
                                   y = stage(start = y, after_stat = y * weights),
                                   colour = after_stat(weights)),
                     show.legend = TRUE) +
  scale_color_gradient(low = "red", high = "blue", limits = c(0, 1),
                       guide = "colourbar")

# plot residuals from quantile regression (median)
ggplot(my.data, aes(x, y)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  stat_fit_residuals(formula = my.formula, method = "rq")

# plot residuals from quantile regression (upper quartile)
ggplot(my.data, aes(x, y)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  stat_fit_residuals(formula = my.formula, method = "rq",
                     method.args = list(tau = 0.75))

# inspecting the returned data
# (print() is needed because only the last value inside braces is auto-printed)
if (requireNamespace("gginnards", quietly = TRUE)) {
  library(gginnards)

  print(
    ggplot(my.data, aes(x, y)) +
      stat_fit_residuals(formula = my.formula, resid.type = "working",
                         geom = "debug")
  )

  print(
    ggplot(my.data, aes(x, y)) +
      stat_fit_residuals(formula = my.formula, method = "rlm",
                         geom = "debug")
  )
}