Following a call to the lessR function Regression in which the returned value is saved into an object, regPlot allows the default plots generated by Regression to be accessed one at a time. The specific motivation for this function is to allow custom placement of the graphs from the regression analysis within knitr. Usually graphics=FALSE is set on the call to Regression within knitr to suppress the default graphics output, which would otherwise place the graphs at the beginning of the knitr output.
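The workflow described above can be sketched as a script suitable for knitr::spin, where `#'` lines become text and `#+` lines start a chunk. This is only an illustration: it assumes a lessR version matching this page (the Reading data read into mydata, the default data frame), and the chunk labels `fit` and `scatter` are hypothetical names.

```r
library(lessR)

#' Fit the model once and suppress the default graphics, so that no
#' plots appear at the start of the knitr output.
#+ fit
mydata <- rd("Reading", format="lessR", quiet=TRUE)
reg.out <- reg(Reading ~ Verbal, graphics=FALSE)

#' Commentary on the fitted model can go here, followed by the
#' scatter plot placed exactly where it is wanted.
#+ scatter
regPlot(reg.out, 1)
```

Because the plot is requested in its own chunk, text can be written before and after it rather than after the entire block of regression output.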
regPlot(out, type, digits.d=NULL, pred.intervals=TRUE,
res.sort=c("cooks","rstudent","dffits","off"),
res.rows=NULL, cooks.cut=1, scatter.coef=NULL,
pdf=FALSE, width=5, height=5, manage.gr=FALSE, ...)
out
The object returned by the lessR function Regression.
type
Type of plot: 1 plots the scatter plot for a single predictor variable, or the scatter plot matrix for multiple predictors. For a single scatter plot, the confidence and prediction intervals are included. 2 plots the density and histogram of the residuals, and 3 plots a scatter plot of the residuals against the fitted values.
digits.d
For the Basic Analysis, the number of decimal digits, set by default to at least 3 or the largest number of digits in the values of the response variable plus 1.
pred.intervals
If set to FALSE, the scatter plot for a single predictor with the response does not contain prediction and confidence intervals.
res.sort
Default is "cooks", which specifies Cook's distance as the sort criterion for the display of the rows of data and associated residuals. Other values are "rstudent" for externally Studentized residuals, "dffits" for dffits, and "off" to not sort the rows of data.
res.rows
Default is 20, which lists the first 20 rows of data sorted by the specified sort criterion. To disable residuals, specify a value of 0. To see the output for all observations, specify a value of "all".
cooks.cut
Cutoff value of Cook's distance. Observations with a larger value are flagged in red and labeled in the resulting scatter plot of residuals and fitted values. Default value is 1.0.
scatter.coef
Display the correlation coefficients in the upper triangle of the scatter plot matrix.
pdf
If TRUE, then graphics are written to pdf files.
width
Width of the pdf file in inches.
height
Height of the pdf file in inches.
manage.gr
Usually leave FALSE. Refers to graphic management of the lessR system.
...
Other parameter values for the R function lm, which provides the core computations.
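A brief sketch of how a few of these arguments might be combined, using only parameters listed in the usage above. It assumes a lessR version matching this page; the 6 by 4 inch size is an arbitrary illustrative choice, and the name and location of the resulting pdf file are determined by lessR and not shown here.

```r
library(lessR)

# Read the internal data set and fit the model without the default graphics
mydata <- rd("Reading", format="lessR", quiet=TRUE)
reg.out <- reg(Reading ~ Verbal, graphics=FALSE)

# Scatter plot with the regression line but without the
# confidence and prediction intervals
regPlot(reg.out, 1, pred.intervals=FALSE)

# Scatter plot written to a pdf file with a custom size (in inches)
regPlot(reg.out, 1, pdf=TRUE, width=6, height=4)
```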
OVERVIEW
The ability to separate plots is particularly useful with knitr, where it allows comments to be interspersed between the plots. For Plot 1, with a single predictor, a scatter plot with the regression line and the confidence and prediction intervals is produced. Otherwise, a scatter plot matrix of all the variables in the model is obtained.
To help assess the validity of the model, Plot 2 displays the distribution of the residuals as a histogram with density plots, both general and normal. Plot 3 plots the residuals against the fitted values and also identifies the points with the largest values of Cook's distance.
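As an illustration of the two diagnostic plots just described, the following sketch requests Plot 2 and Plot 3 separately for a multiple regression model. It assumes a lessR version matching this page; the cutoff of 0.5 is an arbitrary value chosen only to show the cooks.cut argument, not a recommendation.

```r
library(lessR)

# Read the internal data set and fit the model without the default graphics
mydata <- rd("Reading", format="lessR", quiet=TRUE)
r <- reg(Reading ~ Verbal + Absent + Income, graphics=FALSE)

# Plot 2: histogram and density plots of the residuals
regPlot(r, 2)

# Plot 3: residuals against fitted values, flagging observations
# whose Cook's distance exceeds the (illustrative) cutoff of 0.5
regPlot(r, 3, cooks.cut=0.5)
```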
Gerbing, D. W. (2014). R Data Analysis without Programming, Chapters 9 and 10, NY: Routledge.
# NOT RUN {
# read internal data set
mydata <- rd("Reading", format="lessR", quiet=TRUE)
# do regression analysis, save result into reg.out
reg.out <- reg(Reading ~ Verbal)
# The full output already contains these plots, obtained by
# entering the name of the saved object
reg.out
# Particularly for knitr it is useful to obtain the plots
# separately from the full output
# Get the scatter plot of the data with the regression line
# and prediction and confidence intervals
regPlot(reg.out, 1)
# Can use with multiple regression for the scatter plot matrix
r <- reg(Reading ~ Verbal + Absent + Income)
regPlot(r, 1, scatter.coef=TRUE)
# }