Construct a data frame containing the model data, partial residuals for all quantitative predictors, and predictor effects, for use in residual diagnostic plots and other analyses. The result is in tidy form (one row per predictor per observation), allowing it to be easily manipulated for plots and simulations.
partial_residuals(fit, predictors = everything())
Data frame (tibble) containing the model data and residuals in tidy form. There is one row per selected predictor per observation. All predictors are included as columns, plus the following additional columns:
Row number of this observation in the original model data frame.
Name of the predictor this row gives the partial residual for.
Value of the predictor this row gives the partial residual for.
Partial residual for this predictor for this observation.
Predictor effect for this predictor for this observation.
fit: The model to obtain residuals for. This can be a model fit with lm() or glm(), or any model with a predict() method that accepts a newdata argument.
predictors: Predictors to calculate partial residuals for. Defaults to all predictors, skipping factors. Predictors can be specified using tidyselect syntax; see help("language", package = "tidyselect") and the examples below.
To define partial residuals, we must distinguish between the predictors, the measured variables we are using to fit our model, and the regressors, which are calculated from them. In a simple linear model, the regressors are equal to the predictors. But in a model with polynomials, splines, or other nonlinear terms, the regressors may be functions of the predictors.
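For example, in a regression with a single predictor $X$, the regression model $Y = \beta_0 + \beta_1 X + e$ has one regressor, $X$. But if we choose a polynomial of degree 3, the model is $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + e$, and the regressors are $\{X, X^2, X^3\}$.
Similarly, if we have predictors $X_1$ and $X_2$, and form a model with main effects and an interaction, the regressors are $\{X_1, X_2, X_1 X_2\}$.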
Partial residuals are defined in terms of the predictors, not the regressors, and are intended to allow us to see the shape of the relationship between a particular predictor and the response, and to compare it to how we have chosen to model it with regressors. Partial residuals are not useful for categorical (factor) predictors, and so these are omitted.
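The distinction between predictors and regressors can be seen directly in base R. The following is a minimal sketch, not part of the package: it treats the variables on the right-hand side of the formula as the predictors and the non-intercept columns of the design matrix as the regressors, which is a simplifying assumption that holds for this small example.

```r
# Fit a cubic polynomial in a single predictor (disp).
fit <- lm(mpg ~ poly(disp, 3, raw = TRUE), data = mtcars)

# Predictors: the measured variables named in the formula (response dropped).
predictors <- all.vars(formula(fit))[-1]

# Regressors: the columns of the design matrix (intercept dropped).
regressors <- colnames(model.matrix(fit))[-1]

print(predictors)  # one predictor
print(regressors)  # three regressors, all computed from disp
```

One predictor gives rise to three regressors here, which is why a partial residual plot is drawn against the predictor rather than against any single regressor.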
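Consider a linear model where $\mathbb{E}[Y \mid X = x] = \mu(x)$. The mean function $\mu(x)$ is a linear combination of regressors. Let $\hat \mu$ be the fitted model and $\hat \beta_0$ be its intercept.
Choose a predictor $X_f$, the focal predictor, to calculate partial residuals for. Write the mean function as $\mu(X_f, X_o)$, where $X_f$ is the value of the focal predictor and $X_o$ represents all other predictors.
If $e_i$ is the residual for observation $i$, the partial residual is $r_{if} = e_i + (\hat \mu(x_{if}, 0) - \hat \beta_0)$.
Setting $X_o = 0$ means setting all other numeric predictors to 0; factor predictors are set to their first (baseline) level.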
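As an illustration, a partial residual for a linear model can be computed by hand in base R. This sketch assumes the usual definition (the ordinary residual, plus the fitted value with every other predictor set to zero, minus the intercept); it is not the package's own implementation, and the names `others_zero`, `effect`, and `partial` are made up for the example.

```r
fit <- lm(mpg ~ disp + hp, data = mtcars)

# Keep the focal predictor (disp) at its observed values and set the
# other numeric predictor (hp) to zero.
others_zero <- data.frame(disp = mtcars$disp, hp = 0)

# Predictor effect: prediction with the other predictors at zero,
# minus the fitted intercept.
effect <- predict(fit, newdata = others_zero) - coef(fit)[["(Intercept)"]]

# Partial residual: ordinary residual plus the predictor effect.
partial <- residuals(fit) + effect

head(partial)
```

Because this model is linear in disp, the predictor effect reduces to the fitted slope times disp. With polynomial or spline terms the same recipe applies, but the effect is a nonlinear function of the predictor.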
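Consider a generalized linear model where $g(\mathbb{E}[Y \mid X = x]) = \mu(x)$, where $g$ is a link function and $\mu(x)$ is a linear combination of regressors. Let $\hat \mu$ be the fitted model and $\hat \beta_0$ be its intercept.
Let $e_i$ be the working residual for observation $i$, defined to be $e_i = (y_i - \hat m_i) \, g'(\hat m_i)$, where $\hat m_i = g^{-1}(\hat \mu(x_i))$ is the fitted mean for observation $i$.
Choose a predictor $X_f$, the focal predictor, to calculate partial residuals for. Write $\mu$ as $\mu(X_f, X_o)$, where $X_f$ is the value of the focal predictor and $X_o$ represents all other predictors. Hence $\mu(X_f, X_o)$ gives the model's prediction on the link scale.
The partial residual is again $r_{if} = e_i + (\hat \mu(x_{if}, 0) - \hat \beta_0)$.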
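The same calculation can be sketched for a generalized linear model, working on the link scale. This example assumes the standard definition of the working residual, (y - fitted mean) times the derivative of the link at the fitted mean, and uses a Poisson model with a log link, for which that derivative is 1 / mean. The model and variable names are chosen only for illustration.

```r
fit <- glm(carb ~ disp + hp, family = poisson, data = mtcars)

# Working residuals by hand: for the log link, g'(m) = 1 / m.
mu_hat <- fitted(fit)
working <- (mtcars$carb - mu_hat) / mu_hat

# Predictor effect on the link scale, with the other predictor at zero.
others_zero <- data.frame(disp = mtcars$disp, hp = 0)
effect <- predict(fit, newdata = others_zero, type = "link") -
  coef(fit)[["(Intercept)"]]

# Partial residual: working residual plus the predictor effect.
partial <- working + effect

head(partial)
```

The hand-computed working residuals agree with `residuals(fit, type = "working")`, so in practice that call can replace the manual step.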
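In linear regression, because the residuals $e_i$ should have mean zero in a well-specified model, plotting the partial residuals against $x_f$ should produce a shape matching the modeled relationship $\mu$. If the model is wrong, the partial residuals will appear to deviate from the fitted relationship. Provided the regressors are uncorrelated or approximately linearly related to each other, the plotted trend should approximate the true relationship between $x_f$ and the response.
In generalized linear models, this is approximately true if the link function $g$ is approximately linear over the range of observed $x$ values.
Additionally, the function $\mu(X_f, 0)$ can be used to show the relationship between the focal predictor and the response. In a linear model, the function is linear; with polynomial or spline regressors, it is nonlinear. This function is the predictor effect function, and the estimated predictor effects $\hat \mu(x_{if}, 0)$ are included in this function's output.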
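To see how partial residuals expose a misspecified model, here is a small simulation in base R. It is a sketch under stated assumptions: the data-generating process, seed, and variable names are invented for the example, and the partial residuals are computed by hand (ordinary residual plus the prediction with the other predictor at zero, minus the intercept) rather than with the package.

```r
set.seed(1)
n <- 200
x1 <- runif(n, -2, 2)
x2 <- rnorm(n)
# The true relationship with x1 is quadratic.
y <- 1 + x1^2 + 0.5 * x2 + rnorm(n, sd = 0.3)
dat <- data.frame(x1, x2, y)

# Misspecified model: x1 enters only linearly.
fit <- lm(y ~ x1 + x2, data = dat)

# Predictor effect and partial residuals for x1, with x2 set to zero.
effect <- predict(fit, newdata = data.frame(x1 = dat$x1, x2 = 0)) -
  coef(fit)[["(Intercept)"]]
partial <- residuals(fit) + effect

# Partial residuals (points) against the fitted predictor effect (line).
plot(dat$x1, partial, xlab = "x1", ylab = "Partial residual")
ord <- order(dat$x1)
lines(dat$x1[ord], effect[ord], col = "red")

# A quadratic fit to the partial residuals recovers the missing curvature.
curvature <- coef(lm(partial ~ x1 + I(x1^2)))[["I(x1^2)"]]
print(curvature)
```

The points trace a clear U shape around the straight predictor effect line, which is the kind of deviation that suggests adding a nonlinear term for x1.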
Factor predictors (as factors, logical, or character vectors) are detected automatically and omitted. However, if a numeric variable is converted to factor in the model formula, such as with y ~ factor(x), the function cannot determine the appropriate type and will raise an error. Create factors as needed in the source data frame before fitting the model to avoid this issue.
Cook, R. D. (1993). "Exploring Partial Residual Plots." Technometrics, 35(4), 351-362. doi:10.1080/00401706.1993.10485350
Cook, R. D., & Croos-Dabrera, R. (1998). "Partial Residual Plots in Generalized Linear Models." Journal of the American Statistical Association, 93(442), 730-739. doi:10.2307/2670123
Fox, J., & Weisberg, S. (2018). "Visualizing Fit and Lack of Fit in Complex Regression Models with Predictor Effect Plots and Partial Residuals." Journal of Statistical Software, 87(9). doi:10.18637/jss.v087.i09
binned_residuals() for the related binned residuals; augment_longer() for a similarly formatted data frame of ordinary residuals; vignette("linear-regression-diagnostics"), vignette("logistic-regression-diagnostics"), and vignette("other-glm-diagnostics") for examples of plotting and interpreting partial residuals.
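# Assumes the regressinator package, which provides partial_residuals(), is installed:
library(regressinator)
fit <- lm(mpg ~ cyl + disp + hp, data = mtcars)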
partial_residuals(fit)
# You can select predictors with tidyselect syntax:
partial_residuals(fit, c(disp, hp))
# Predictors with multiple regressors are supported:
fit2 <- lm(mpg ~ poly(disp, 2), data = mtcars)
partial_residuals(fit2)
# Allowing an interaction by number of cylinders is fine, but partial
# residuals are not generated for the factor. Notice the factor must be
# created first, not in the model formula:
mtcars$cylinders <- factor(mtcars$cyl)
fit3 <- lm(mpg ~ cylinders * disp + hp, data = mtcars)
partial_residuals(fit3)