sjp.glm: Plot estimates, predictions or effects of generalized linear models

Description

Plot odds ratios or incidence rate ratios with confidence intervals as a dot plot. Depending on the type argument, this function may also plot model assumptions for generalized linear models, or marginal effects (predicted probabilities or events).
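
As a rough illustration of the quantities shown in the default dot plot, the following base R sketch exponentiates the coefficients of a logistic regression to obtain odds ratios with intervals. It does not call sjp.glm. The Wald-type intervals from confint.default and the names dat, fit_or and or_table are assumptions chosen for this example, and the function itself may compute its intervals differently.

```r
# Binary outcome built from the built-in swiss data, as in the Examples section
dat <- swiss
dat$y <- ifelse(dat$Fertility < median(dat$Fertility), 0, 1)

fit_or <- glm(y ~ Education + Examination + Infant.Mortality + Catholic,
              family = binomial(link = "logit"), data = dat)

# Coefficients are on the log-odds scale; exponentiate to get odds ratios
odds_ratios <- exp(coef(fit_or))

# Wald-type 95% intervals on the log-odds scale, then exponentiated
# (assumption: sjp.glm may use a different interval method)
ci <- exp(confint.default(fit_or))

or_table <- data.frame(term = names(odds_ratios),
                       OR = odds_ratios,
                       lower = ci[, 1],
                       upper = ci[, 2],
                       row.names = NULL)

# Drop the intercept, mirroring show.intercept = FALSE
or_table <- or_table[or_table$term != "(Intercept)", ]
print(or_table)
```

Exponentiating is only meaningful for link functions where the result has a ratio interpretation, such as the logit or log link.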

Usage

sjp.glm(fit, type = "dots", vars = NULL, group.estimates = NULL,
  remove.estimates = NULL, sort.est = TRUE, title = NULL,
  legend.title = NULL, axis.labels = NULL, axis.title = NULL,
  geom.size = NULL, geom.colors = "Set1", wrap.title = 50,
  wrap.labels = 25, axis.lim = NULL, grid.breaks = 0.5,
  trns.ticks = TRUE, show.intercept = FALSE, show.values = TRUE,
  show.p = TRUE, show.ci = FALSE, show.legend = FALSE,
  show.summary = FALSE, show.scatter = TRUE, point.alpha = 0.2,
  point.color = NULL, jitter.ci = FALSE, digits = 2, vline.type = 2,
  vline.color = "grey70", coord.flip = TRUE, y.offset = 0.15,
  facet.grid = TRUE, prnt.plot = TRUE, ...)

Arguments

fit

Fitted generalized linear model (glm- or logistf-object).

type

Type of plot. Use one of the following:

"dots" (default; alternatively "glm" or "or"): to plot odds ratios or incidence rate ratios (forest plot). Note that this type plots the exponentiated estimates and is therefore appropriate only for specific link functions.
"slope": to plot probability or incidence slopes (predicted probabilities or incidences) for each model term, with all remaining covariates set to zero (i.e. ignored). Use facet.grid to decide whether to plot each coefficient as a separate plot or as an integrated faceted plot.
"eff": to plot marginal effects of predicted probabilities or incidences for each model term, with all remaining covariates set to the mean (see 'Details'). Use facet.grid to decide whether to plot each coefficient as a separate plot or as an integrated faceted plot.
"pred": to plot predicted values for the response, related to specific model predictors. See 'Details'.
"ma": to check model assumptions. Note that the only relevant argument for this option is fit. All other arguments are ignored.
"vif": to plot Variance Inflation Factors.

vars

Numeric vector with column indices of selected variables, or a character vector with names of selected variables from the fitted model, which should be used to plot (depending on type) estimates, fixed effects slopes or predicted values (means, probabilities, incidence rates, ...). See 'Examples'.

group.estimates

Numeric or character vector, indicating a group identifier for each estimate. Dots and confidence intervals of estimates are coloured according to their group association. See 'Examples'.

remove.estimates

Character vector with coefficient names that indicate which estimates should be removed from the plot. remove.estimates = "est_name" would remove the estimate est_name. Default is NULL, i.e. all estimates are printed.

sort.est

Logical, determines whether estimates should be sorted according to their values. If group.estimates is not NULL, estimates are sorted according to their group assignment.

title

Character vector, used as plot title. Depending on plot type and function, the title is set automatically. If title = "", no title is printed. For effect plots, may also be a character vector of length > 1, to define titles for each sub-plot or facet.

legend.title

Character vector, used as title for the plot legend. Note that only some plot types have legends (e.g. type = "pred" or when grouping estimates with group.estimates).

axis.labels

Character vector with labels used as axis labels. Optional argument, since in most cases axis labels are set automatically.

axis.title

Character vector of length one or two (depending on the plot function and type), used as title(s) for the x and y axis. If not specified, a default labelling is chosen. To set multiple axis titles (e.g. with type = "eff" for many predictors), axis.title must be a character vector with one element per printed plot. In this case, each plot gets its own axis title (applying, for instance, to the y-axis for type = "eff"). Note: Some plot types do not support this argument. In such cases, use the return value and add axis titles manually with labs, e.g.: $plot.list[[1]] + labs(x = ...)

geom.size

Size or width of the geoms (bar width, line thickness or point size, depending on plot type and function). Note that bar and bin widths mostly need smaller values than dot sizes.

geom.colors

User-defined color palette for geoms. If group.estimates is not specified, must be either a vector with two color values or a specific color palette code (see 'Details' in sjp.grpfrq). If group.estimates is specified, geom.colors must be a vector with one color per group. See 'Examples'.

wrap.title

Numeric, determines how many characters of the plot title are displayed on one line before a line break is inserted.

wrap.labels

Numeric, determines how many characters of the value, variable or axis labels are displayed on one line before a line break is inserted.

axis.lim

Numeric vector of length 2, defining the range of the plot axis. Depending on plot type, may affect either the x- or y-axis, or both. For multiple plot outputs (e.g., from type = "eff" or type = "slope" in sjp.glm), axis.lim may also be a list of vectors of length 2, defining axis limits for each plot (only if non-faceted).

grid.breaks

Numeric, sets the distance between breaks for the axis, i.e. a major grid line is drawn at every multiple of grid.breaks.

trns.ticks

Logical, if TRUE, the grid lines are placed at exponentially increasing values, so that they appear equidistant, i.e. they visually have the same distance from one panel grid line to the next. If FALSE, grid lines are drawn at every multiple of grid.breaks, so they move closer together at higher odds ratio values.

show.intercept

Logical, if TRUE, the intercept of the fitted model is also plotted. Default is FALSE. For glm objects, please note that due to the exponential transformation of estimates, the intercept cannot be calculated in some cases; the function call is then interrupted and no plot is printed.

show.values

Logical, whether values should be plotted or not.

show.p

Logical, adds significance levels to values, or value and variable labels.

show.ci

Logical, if TRUE, depending on type, a confidence interval or region is added to the plot. For frequency plots, the confidence interval for the relative frequencies is shown.

show.legend

Logical, if TRUE, and depending on plot type and function, a legend is added to the plot.

show.summary

Logical, if TRUE, a summary with model statistics is added to the plot.

show.scatter

Logical, if TRUE (default), adds a scatter plot of data points to the plot. Only applies to slope-type or prediction plots. For most plot types, dots are jittered to avoid overplotting, hence the points don't reflect exact values in the data.

point.alpha

Alpha value of point-geoms in the scatter plots. Only applies if show.scatter = TRUE.

point.color

Color of point-geoms in the scatter plots. Only applies if show.scatter = TRUE.

jitter.ci

Logical, if TRUE and show.ci = TRUE and confidence bands are displayed as error bars, adds jittering to lines and error bars to avoid overlapping.

digits

Numeric, number of digits after the decimal point when rounding estimates and values.

vline.type

Linetype of the vertical "zero point" line. Default is 2 (dashed line).

vline.color

Color of the vertical "zero point" line. Default value is "grey70".

coord.flip

Logical, if TRUE, the x and y axes are swapped.

y.offset

Numeric, offset for text labels when their alignment is adjusted to the top/bottom of the geom (see hjust and vjust).

facet.grid

Logical, if TRUE, multiple plots are arranged as a grid within a single integrated plot. This argument calls facet_wrap or facet_grid to arrange plots. Use plot_grid to plot multiple plot objects as an arranged grid with grid.arrange.

prnt.plot

Logical, if TRUE (default), plots the results as a graph. Use FALSE if you don't want to plot any graphs. In either case, the ggplot-object will be returned as value.

...

Other arguments passed down to further functions. Currently, the following arguments are supported:

?effects::effect: Any arguments accepted by the effect or allEffects function, for type = "eff".
width: The width-argument for error bars.
alpha: The alpha-argument for confidence bands.
level: The level-argument for confidence bands.

Value

(Invisibly) returns, depending on the plot type:

The ggplot-object (plot). For multiple plots, and if facet.grid = FALSE, a plot.list is returned.
A data frame data with the data used to build the ggplot-object(s), or a list of data frames (data.list).

Details

type = "slope": the predicted values are based on the intercept's estimate and each specific term's estimate. All other covariates are set to zero (i.e. ignored), which corresponds to family(fit)$linkinv(eta = b0 + bi * xi) (where b0 is the intercept's estimate, bi is the estimate of the term and xi is the value of the corresponding predictor). This plot type can be seen as equivalent to type = "slope" for sjp.lm, just for glm objects. This plot type may give results similar to type = "pred"; however, type = "slope" does not adjust for other predictors.
type = "eff": computes marginal effects of all higher order terms in the model. The predicted values computed by type = "eff" are adjusted for all other covariates, by setting them to the mean (as returned by the allEffects function). You can pass further arguments down to allEffects via the ...-argument for a more flexible function call.
type = "pred": the predicted values of the response are computed, based on the predict.glm method. Corresponds to predict(fit, type = "response"). This plot type requires the vars argument to select specific terms that should be used for the x-axis and, optionally, as grouping factor. Hence, vars must be a character vector with the names of one or two model predictors. See 'Examples'.
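
The difference between these three kinds of predicted values can be sketched in base R without calling sjp.glm. The model, the object names (dat, fit_d, p_slope, p_mean, p_pred) and the use of predict with a covariate fixed at its mean are assumptions made for illustration; in particular, fixing a covariate at its mean is only a simplified stand-in for what allEffects does.

```r
# Small logistic model on the built-in swiss data (illustrative only)
dat <- swiss
dat$y <- ifelse(dat$Fertility < median(dat$Fertility), 0, 1)
fit_d <- glm(y ~ Education + Catholic,
             family = binomial(link = "logit"), data = dat)

b <- coef(fit_d)
linkinv <- family(fit_d)$linkinv
x <- seq(min(dat$Education), max(dat$Education), length.out = 50)

# "slope"-style: intercept plus one term, all other covariates set to zero
p_slope <- linkinv(b["(Intercept)"] + b["Education"] * x)

# The same values via predict(), with Catholic explicitly set to 0
nd_zero <- data.frame(Education = x, Catholic = 0)
p_zero <- predict(fit_d, newdata = nd_zero, type = "response")

# "eff"-style (simplified): other covariate held at its mean
nd_mean <- data.frame(Education = x, Catholic = mean(dat$Catholic))
p_mean <- predict(fit_d, newdata = nd_mean, type = "response")

# "pred"-style: predictions for the observed data
p_pred <- predict(fit_d, type = "response")

# Compare the unadjusted and the mean-adjusted curves
head(data.frame(Education = x, slope = p_slope, adjusted = p_mean))
```

Unless the other covariates happen to have a mean of zero, the "slope" curve and the mean-adjusted curve differ, which is why type = "slope" should not be read as adjusted for other predictors.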

Examples

# NOT RUN {
# prepare dichotomous dependent variable
swiss$y <- ifelse(swiss$Fertility < median(swiss$Fertility), 0, 1)

# fit model
fitOR <- glm(y ~ Education + Examination + Infant.Mortality + Catholic,
             family = binomial(link = "logit"), data = swiss)

# print Odds Ratios as dots
sjp.glm(fitOR)

# -------------------------------
# Predictors for negative impact of care. Data from
# the EUROFAMCARE sample dataset
# -------------------------------
library(sjmisc)
library(sjlabelled)
data(efc)
# create binary response
y <- ifelse(efc$neg_c_7 < median(na.omit(efc$neg_c_7)), 0, 1)
# create data frame for fitted model
mydf <- data.frame(y = as.factor(y),
                   sex = to_factor(efc$c161sex),
                   dep = to_factor(efc$e42dep),
                   barthel = efc$barthtot,
                   education = to_factor(efc$c172code))
# fit model
fit <- glm(y ~., data = mydf, family = binomial(link = "logit"))

# plot odds ratios
sjp.glm(fit, title = get_label(efc$neg_c_7))

# plot probability curves (relationship between predictors and response)
sjp.glm(fit, title = get_label(efc$neg_c_7), type = "slope")

# --------------------------
# grouping estimates
# --------------------------
sjp.glm(fit,  group.estimates = c(1, 2, 2, 2, 3, 4, 4))

# --------------------------
# model predictions, with selected model terms.
# 'vars' needs to be a character vector of length 1 to 3
# with names of model terms for x-axis and grouping factor.
# --------------------------
sjp.glm(fit, type = "pred", vars = "barthel")
# faceted, with ci
sjp.glm(fit, type = "pred", vars = c("barthel", "dep"), show.ci = TRUE)
# w/o facets
sjp.glm(fit, type = "pred", vars = c("barthel", "dep"), facet.grid = FALSE)
# with third grouping variable - this type automatically uses grid layout
sjp.glm(fit, type = "pred", vars = c("barthel", "sex", "education"))

# }