plot_predictions: Plot predicted effects from make_predictions

Description

The companion function to make_predictions(). This takes data from make_predictions() (or elsewhere) and plots them like effect_plot(), interact_plot(), and cat_plot(). Note that some arguments will be ignored if the inputted predictions

Usage

plot_predictions(predictions, pred = NULL, modx = NULL, mod2 = NULL,
  resp = NULL, data = NULL, geom = c("point", "line", "bar",
  "boxplot"), plot.points = FALSE, interval = FALSE,
  pred.values = NULL, modx.values = NULL, mod2.values = NULL,
  linearity.check = FALSE, facet.modx = FALSE, x.label = NULL,
  y.label = NULL, pred.labels = NULL, modx.labels = NULL,
  mod2.labels = NULL, main.title = NULL, legend.main = NULL,
  color.class = NULL, line.thickness = 1.1, vary.lty = NULL,
  jitter = 0, weights = NULL, rug = FALSE, rug.sides = "b",
  force.cat = FALSE, point.shape = FALSE, geom.alpha = NULL,
  dodge.width = NULL, errorbar.width = NULL,
  interval.geom = c("errorbar", "linerange"), pred.point.size = 3.5,
  point.size = 1, ...)

Arguments

predictions

Either the output from make_predictions() (an object of class "predictions") or a data frame of predicted values.

pred

The name of the predictor variable involved in the interaction. This can be a bare name or string.

modx

The name of the moderator variable involved in the interaction. This can be a bare name or string.

mod2

Optional. The name of the second moderator variable involved in the interaction. This can be a bare name or string.

resp

What is the name of the response variable? Use a string.

data

Optional, default is NULL. You may provide the data used to fit the model. This can be a better way to get mean values for centering and can be crucial for models with variable transformations in the formula (e.g., log(x)) or polynomial terms (e.g., poly(x, 2)). If the function detects problems that would likely be solved by providing the data with this argument, you will see a warning and the function will attempt to retrieve the original data from the global environment.

geom

For factor predictors only: What type of plot should this be? There are several options here since the best way to visualize categorical interactions varies by context. Here are the options:

"point": The default. Simply plot the point estimates. You may want to use point.shape = TRUE with this and you should also consider interval = TRUE to visualize uncertainty.
"line": This connects observations across levels of the pred variable with a line. This is a good option when the pred variable is ordinal (ordered). You may still consider point.shape = TRUE and interval = TRUE is still a good idea.
"bar": A bar chart. Some call this a "dynamite plot." Many applied researchers advise against this type of plot because it does not represent the distribution of the observed data or the uncertainty of the predictions very well. It is best to at least use the interval = TRUE argument with this geom.
"boxplot": This geom plots a dot and whisker plot. These can be useful for understanding the distribution of the observed data without having to plot all the observed points (especially helpful with larger data sets). However, it is important to note the boxplots are not based on the model whatsoever.

plot.points

Logical. If TRUE, plots the actual data points as a scatterplot on top of the interaction lines. The color of the dots will be based on their moderator value.

interval

Logical. If TRUE, plots confidence/prediction intervals around the line using geom_ribbon.

pred.values

Which values of the predictor should be included in the plot? By default, all levels are included.

modx.values

For which values of the moderator should lines be plotted? Default is NULL. If NULL, then the customary +/- 1 standard deviation from the mean as well as the mean itself are used for continuous moderators. If "plus-minus", plots lines when the moderator is at +/- 1 standard deviation without the mean. You may also choose "terciles" to split the data into equally-sized groups and choose the point at the mean of each of those groups.

If the moderator is a factor variable and modx.values is NULL, each level of the factor is included. You may specify any subset of the factor levels (e.g., c("Level 1", "Level 3")) as long as there is more than 1. The levels will be plotted in the order you provide them, so this can be used to reorder levels as well.
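
The defaults described above for continuous moderators can be sketched in base R. This is only an illustration, not the package's own code: the helper name moderator_values() and its type labels are hypothetical, and the tercile rule (rank the data, split it into three roughly equal groups, take each group's mean) is an assumed reading of the description.

```r
# Hypothetical helper (not part of jtools): candidate values for a
# continuous moderator under the three options described above.
moderator_values <- function(x,
                             type = c("mean-plus-minus", "plus-minus", "terciles")) {
  type <- match.arg(type)
  x <- x[!is.na(x)]
  m <- mean(x)
  s <- sd(x)
  switch(type,
    # Mean and +/- 1 SD (assumed to mirror modx.values = NULL)
    "mean-plus-minus" = c("- 1 SD" = m - s, "Mean" = m, "+ 1 SD" = m + s),
    # +/- 1 SD without the mean (assumed to mirror "plus-minus")
    "plus-minus" = c("- 1 SD" = m - s, "+ 1 SD" = m + s),
    "terciles" = {
      # Split into three roughly equal-sized groups by rank,
      # then take the mean of each group.
      groups <- cut(rank(x, ties.method = "first"), breaks = 3,
                    labels = c("Lower", "Middle", "Upper"))
      vapply(split(x, groups), mean, numeric(1))
    }
  )
}

# Example with a built-in data set
moderator_values(mtcars$wt)
moderator_values(mtcars$wt, "terciles")
```

Values computed this way could then be passed as a numeric vector to modx.values.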

mod2.values

By which values of the second moderator should the plot be faceted? That is, there will be a separate plot for each level of this moderator. Defaults are the same as modx.values.

linearity.check

For two-way interactions only. If TRUE, plots a pane for each level of the moderator and superimposes a loess smoothed line (in gray) over the plot. This enables you to see if the effect is linear through the span of the moderator. See Hainmueller et al. (2016) in the references for more details on the intuition behind this. It is recommended that you also set plot.points = TRUE and use modx.values = "terciles" with this option.

facet.modx

Create separate panels for each level of the moderator? Default is FALSE, except when linearity.check is TRUE.

x.label

A character object specifying the desired x-axis label. If NULL, the variable name is used.

y.label

A character object specifying the desired y-axis label. If NULL, the variable name is used.

pred.labels

A character vector of 2 labels for the predictor if it is a 2-level factor or a continuous variable with only 2 values. If NULL, the default, the factor labels are used.

modx.labels

A character vector of labels for each level of the moderator values, provided in the same order as the modx.values argument. If NULL, the values themselves are used as labels unless modx.values is also NULL. In that case, "+1 SD" and "-1 SD" are used.

mod2.labels

A character vector of labels for each level of the 2nd moderator values, provided in the same order as the mod2.values argument. If NULL, the values themselves are used as labels unless mod2.values is also NULL. In that case, "+1 SD" and "-1 SD" are used.

main.title

A character object that will be used as an overall title for the plot. If NULL, no main title is used.

legend.main

A character object that will be used as the title that appears above the legend. If NULL, the name of the moderating variable is used.

color.class

See jtools_colors for details on the types of arguments accepted. Default is "CUD Bright" for factor moderators and "Blues" for +/- SD and user-specified modx.values.

line.thickness

How thick should the plotted lines be? Default is 1.1; ggplot's default is 1.

vary.lty

Should the resulting plot have a different line type for each line in addition to colors? Default is NULL, which will switch to FALSE if pred is a factor and TRUE if pred is continuous.

jitter

How much should plot.points observed values be "jittered" via ggplot2::position_jitter()? When there are many points near each other, jittering moves them a small amount to keep them from totally overlapping. In some cases, though, it can add confusion since it may make points appear to be outside the boundaries of observed values or cause other visual issues. Default is 0, but try various small values (e.g., 0.1) and increase as needed if your points are overlapping too much. If the argument is a vector with two values, then the first is assumed to be the jitter for width and the second for the height.
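
The one-value versus two-value behavior can be sketched as follows. The helper split_jitter() is hypothetical, and applying a single value to both width and height is an assumption; only the two-value case is stated above.

```r
# Hypothetical helper (not part of jtools): expand a jitter argument
# into separate width and height amounts.
split_jitter <- function(jitter = 0) {
  stopifnot(is.numeric(jitter), length(jitter) %in% c(1, 2))
  if (length(jitter) == 1) {
    # Assumption: a single value applies to both directions
    list(width = jitter, height = jitter)
  } else {
    # Documented: first value is width, second is height
    list(width = jitter[1], height = jitter[2])
  }
}

# Jitter horizontally only, leaving the response values untouched
j <- split_jitter(c(0.1, 0))

# If ggplot2 is installed, the pair maps onto position_jitter()
if (requireNamespace("ggplot2", quietly = TRUE)) {
  pos <- ggplot2::position_jitter(width = j$width, height = j$height)
}
```

Setting the height to 0 is one way to avoid making points appear outside the observed range of the response.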

weights

If the data are weighted, provide a vector of weights here. This is only used if plot.points = TRUE and data is not NULL.

rug

Show a rug plot in the margins? This uses ggplot2::geom_rug() to show the distribution of the predictor (top/bottom) and/or response variable (left/right) in the original data. Default is FALSE.

rug.sides

On which sides should rug plots appear? Default is "b", meaning bottom. "t" and/or "b" show the distribution of the predictor while "l" and/or "r" show the distribution of the response. "bl" is a good option to show both the predictor and response.

force.cat

Force the predictor to be treated as if it is a factor, even if it isn't? Default is FALSE. Set to TRUE if you'd like to generate a type of plot normally reserved for categorical variables. This can be helpful for numeric variables that have a small number of unique values, for instance.

point.shape

For plotted points (either of observed data or of predicted values with the "point" or "line" geoms), should the shape of the points vary by the values of the factor? This is especially useful if you want the plot to remain readable in black-and-white print or for colorblind readers.

geom.alpha

What should the alpha aesthetic be for the plotted lines/bars? Default is NULL, which means it is set depending on the value of geom and plot.points.

dodge.width

What should the width argument to ggplot2::position_dodge() be? Default is NULL, which means it is set depending on the value of geom.

errorbar.width

How wide should the error bars be? Default is NULL, meaning it is set depending on the value of geom. Ignored if interval is FALSE.

interval.geom

For categorical by categorical interactions. One of "errorbar" or "linerange". If the former, ggplot2::geom_errorbar() is used. If the latter, ggplot2::geom_linerange() is used.

pred.point.size

If geom is "point" or "line", sets the size of the predicted points. Default is 3.5. Note the distinction from point.size, which refers to the observed data points.

point.size

What size should be used for observed data when plot.points is TRUE? Default is 1.

...

Ignored.

Details

This is designed to offer more flexibility than the canned functions (effect_plot(), interact_plot(), and cat_plot()), by letting you generate your own predicted data and iteratively experiment with the plotting options.
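
As a rough illustration of generating your own predicted data, the sketch below builds a data frame of predictions with base R only. The model, variables, and grid are arbitrary choices for the example. The column names plot_predictions() expects in a plain data frame are not documented in this section, so the columns produced here (fit, lwr, upr) may need renaming before use.

```r
# Fit a model with a two-way interaction on a built-in data set
fit <- lm(mpg ~ hp * wt, data = mtcars)

# Moderator values: mean and +/- 1 SD of wt
wt_mean <- mean(mtcars$wt)
wt_sd <- sd(mtcars$wt)

# Grid of predictor values crossed with the moderator values
grid <- expand.grid(
  hp = seq(min(mtcars$hp), max(mtcars$hp), length.out = 50),
  wt = c(wt_mean - wt_sd, wt_mean, wt_mean + wt_sd)
)

# Predicted values with 95% confidence intervals
# (predict.lm returns a matrix with columns fit, lwr, upr)
preds <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)

# One row per predictor/moderator combination
predicted <- cbind(grid, as.data.frame(preds))
head(predicted)
```

A data frame like this could be supplied as the predictions argument, with pred, modx, and resp naming the relevant columns.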

Note: predictions objects from make_predictions() store information about the arguments used to create the object. Unless you specify those arguments manually to this function, as a convenience plot_predictions will use the arguments stored in the predictions object. Those arguments are:

pred, modx, and mod2
resp
pred.values, modx.values, and mod2.values
pred.labels, modx.labels, and mod2.labels
data
interval
linearity.check
weights
