forestplot: Draws a forest plot

Description

This function generates a forest plot with extended capabilities compared to the default forestplot() function in the rmeta package. It overcomes some limitations of the original function by adding support for expressions, multiple confidence bands per label, autosizing to the viewport, and modern tidyverse syntax. Refer to vignette("forestplot") for comprehensive details.

Usage

forestplot(...)
# S3 method for data.frame
forestplot(x, mean, lower, upper, labeltext, is.summary, boxsize, ...)
# S3 method for default
forestplot(
  labeltext,
  mean,
  lower,
  upper,
  align = NULL,
  is.summary = FALSE,
  graph.pos = "right",
  hrzl_lines = NULL,
  clip = c(-Inf, Inf),
  xlab = NULL,
  zero = ifelse(xlog, 1, 0),
  graphwidth = "auto",
  colgap = NULL,
  lineheight = "auto",
  line.margin = NULL,
  col = fpColors(),
  txt_gp = fpTxtGp(),
  xlog = FALSE,
  xticks = NULL,
  xticks.digits = 2,
  grid = FALSE,
  lwd.xaxis = NULL,
  lwd.zero = 1,
  lwd.ci = NULL,
  lty.ci = 1,
  ci.vertices = NULL,
  ci.vertices.height = 0.1,
  boxsize = NULL,
  mar = unit(rep(5, times = 4), "mm"),
  title = NULL,
  legend = NULL,
  legend_args = fpLegend(),
  new_page = getOption("forestplot_new_page", TRUE),
  fn.ci_norm = fpDrawNormalCI,
  fn.ci_sum = fpDrawSummaryCI,
  fn.legend = NULL,
  shapes_gp = fpShapesGp(),
  ...
)
# S3 method for gforge_forestplot
print(x, ...)
# S3 method for gforge_forestplot
plot(x, y, ..., new_page = FALSE)
# S3 method for grouped_df
forestplot(x, labeltext, mean, lower, upper, legend, is.summary, boxsize, ...)

Value

gforge_forestplot object

Arguments

...: Passed on to the fn.ci_norm and fn.ci_sum arguments
x: The gforge_forestplot object to be printed
mean: The name of the column if using the dplyr select syntax - defaults to "mean", else it should be a vector or a matrix with the averages. You can also provide a 2D/3D matrix that is automatically converted to the lower/upper parameters. The values should be in exponentiated form if they follow this interpretation, e.g. use exp(mean) if you have the output from a logistic regression
lower: The lower bound of the confidence interval for the forestplot, needs to be the same format as the mean.
upper: The upper bound of the confidence interval for the forestplot, needs to be the same format as the mean.
labeltext: A list, matrix, vector or expression with the names of each row or the name of the column if using the dplyr select syntax - defaults to "labeltext". Note that when using group_by a separate labeltext is not allowed. The list should be nested as m x n elements so that it resembles a matrix: list(list("rowname 1 col 1", "rowname 2 col 1"), list("r1c2", expression(beta))). You can also provide a matrix, although this cannot have expressions by design: matrix(c("rowname 1 col 1", "rowname 2 col 1", "r1c2", "beta"), ncol = 2). Use NAs for blank spaces; if you provide a full column of NA, that column becomes an empty column that adds some space. Note: If you do not provide the mean/lower/upper arguments the function expects the label text to be a matrix containing the labeltext in the rownames and then columns for mean, lower, and upper.
is.summary: A vector indicating by TRUE/FALSE if the value is a summary value which means that it will have a different font-style
boxsize: Override the default box size based on precision
align: Vector giving alignment (l,r,c) for the table columns
graph.pos: The position of the graph element within the table of text. The position can be 1-(ncol(labeltext) + 1). You can also choose to set the position to "left" or "right".
hrzl_lines: Add horizontal lines to graph. Can either be TRUE or a list of gpar. See the Horizontal lines section below for details.
clip: Lower and upper limits for clipping confidence intervals to arrows
xlab: x-axis label
zero: x-axis coordinate for zero line. If you provide a vector of length 2 it will print a rectangle instead of just a line. If you provide NA the line is suppressed.
graphwidth: Width of the confidence interval graph; see unit for details on how to use mm etc. The default, "auto", uses whatever space is left after adjusting for text size and legend.
colgap: Sets the gap between columns, defaults to 6 mm but for relative widths. Note that the value should be in unit(,"npc").
lineheight: Height of the graph. The default, "auto", adjusts to the space that is left after adjusting for x-axis size and legend. To fix the line height, for instance to standardize it across several forestplots, provide a unit object; a good option is unit(2, "cm"). A third option is "lines", which gives a line height 50% larger than the text height.
line.margin: Set the margin between rows, provided in numeric or unit form. When there are multiple confidence lines per row, setting the correct margin helps to visually separate rows.
col: Set the colors for all the elements. See fpColors for details
txt_gp: Set the fonts etc for all text elements. See fpTxtGp for details
xlog: If TRUE, x-axis tick marks follow a logarithmic scale, e.g. for logistic regression (OR), survival estimates (HR), Poisson regression etc. Note: This is an intentional break with the original forestplot function as I've found that exponentiated ticks/clips/zero effect are more difficult for non-statisticians and there are sometimes issues with rounding the tick marks properly.
xticks: Optional user-specified x-axis tick marks. Specify NULL to use the defaults, numeric(0) to omit the x-axis. By adding a labels attribute, attr(my_ticks, "labels") <- ..., you can dictate the text printed at each tick. If you specify a boolean vector then ticks indicated with FALSE won't be printed. Note that the labels have to be the same length as the main variable.
xticks.digits: The number of digits to allow in the x-axis if this is created by default
grid: If you want a discrete gray dashed grid at the level of the ticks you can set this parameter to TRUE. If you set the parameter to a vector of values lines will be drawn at the corresponding positions. If you want to specify the gpar of the lines then either directly pass a gpar object or set the gp attribute e.g. attr(line_vector, "gp") <- gpar(lty = 2, col = "red")
lwd.xaxis: lwd for the xaxis, see gpar
lwd.zero: lwd for the vertical line that gives the no-effect line, see gpar
lwd.ci: lwd for the confidence bands, see gpar
lty.ci: lty for the confidence bands, see gpar
ci.vertices: Set this to TRUE if you want the ends of the confidence intervals to be shaped as a T. This defaults to TRUE if you have any line type other than 1, since there is a risk of a dash occurring at the very end, i.e. showing an incorrectly narrow confidence interval.
ci.vertices.height: The height of the vertices. Defaults to npc units corresponding to 10% of the row height. Note that the arrows correspond to the vertices heights.
mar: A numerical vector of the form c(bottom, left, top, right) of the type unit
title: The title of the plot if any
legend: Legend corresponding to the number of bars
legend_args: The legend arguments as returned by the fpLegend function.
new_page: If you want the plot to appear on a new blank page then set this to TRUE, by default it is TRUE. If you want to change this behavior for all plots then set the options(forestplot_new_page = FALSE)
fn.ci_norm: You can specify exactly how the line with the box is drawn for the normal (i.e. non-summary) confidence interval by changing this parameter to your own function or some of the alternatives provided in the package. It defaults to the box function fpDrawNormalCI
fn.ci_sum: Same as previous argument but for the summary outputs and it defaults to fpDrawSummaryCI.
fn.legend: The function used for drawing the legends; this can be a list if you want different functions. It defaults to a box if you have anything other than a single function or the number of columns in the mean argument
shapes_gp: Sets graphical parameters (squares and lines widths, styles, etc.) of all shapes drawn (squares, lines, diamonds, etc.). This overrides col, lwd.xaxis, lwd.zero, lwd.ci and lty.ci.
y: Ignored
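
The attribute-based conventions described for xticks and grid can be sketched as follows. The tick positions, grid positions, and study data are made-up illustrative values, and the plotting call only runs if the forestplot package is installed.

```r
library(grid)  # gpar(); grid ships with R

# Hypothetical tick positions on a linear x-axis
my_ticks <- seq(from = -0.1, to = 0.05, by = 0.025)

# Logical "labels" attribute: ticks marked FALSE are not printed,
# so only every other tick is shown here
attr(my_ticks, "labels") <- rep(c(TRUE, FALSE), length.out = length(my_ticks))

# Grid lines at chosen positions, styled through the "gp" attribute
grid_lines <- structure(c(-0.05, 0.025), gp = gpar(lty = 2, col = "red"))

# The labels attribute has to match the tick vector in length
stopifnot(length(attr(my_ticks, "labels")) == length(my_ticks))

if (requireNamespace("forestplot", quietly = TRUE)) {
  library(forestplot)
  # Illustrative data only
  fp <- forestplot(
    labeltext = cbind(c("Study A", "Study B", "Study C")),
    mean  = c(-0.06, -0.02, 0.01),
    lower = c(-0.09, -0.05, -0.03),
    upper = c(-0.03, 0.01, 0.04),
    xticks = my_ticks,
    grid = grid_lines
  )
  print(fp)
}
```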

Multiple bands

Multiple bands (or lines) per variable can be useful for comparing different outcomes. For instance, you may want to compare heart disease-specific survival to overall survival rates for smokers. It can be insightful to overlay two bands for this purpose. Another application could be displaying crude and adjusted estimates as separate bands.
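
As a minimal sketch of the crude versus adjusted case, multiple bands can be supplied by giving mean, lower, and upper as matrices with one column per band. The hazard ratios below are invented for illustration, and the plotting call is skipped when the forestplot package is not installed.

```r
# Hypothetical hazard ratios for three variables.
# Column 1 holds the crude estimates, column 2 the adjusted ones.
labels <- c("Age", "Smoking", "Diabetes")
hr    <- cbind(c(1.20, 1.85, 1.40), c(1.10, 1.60, 1.25))
lower <- cbind(c(1.05, 1.40, 1.10), c(0.95, 1.20, 0.98))
upper <- cbind(c(1.37, 2.44, 1.78), c(1.27, 2.13, 1.59))

# All three matrices must share the same shape: rows = labels, columns = bands
stopifnot(identical(dim(hr), dim(lower)), identical(dim(hr), dim(upper)))

if (requireNamespace("forestplot", quietly = TRUE)) {
  library(forestplot)
  fp <- forestplot(
    labeltext = cbind(labels),
    mean = hr,
    lower = lower,
    upper = upper,
    legend = c("Crude", "Adjusted"),  # one legend entry per band
    zero = 1
  )
  print(fp)
}
```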

Horizontal lines

The hrzl_lines argument can be set as TRUE or a list with grid::gpar elements.

TRUE: A line will be added based upon the is.summary rows. If the first line is a summary it
grid::gpar: The same as above but the lines will be formatted according to the grid::gpar element
list: The list must either be numbered, i.e. list("2" = gpar(lty = 1)), or have the same length as NROW(mean) + 1. If the list is numbered the numbers should not exceed NROW(mean) + 1. The no. 1 row designates the top, i.e. the line above the first row; all others correspond to the row below. Each element in the list needs to be TRUE, NULL, or a gpar element. TRUE defaults to a standard line, NULL skips a line, while gpar corresponds to a fully customized line. Apart from allowing standard gpar line descriptions (lty, lwd, col, and more), you can also specify gpar(columns = c(1:3, 5)) if you for instance want the line to skip a column.
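
The two list forms described above might be built as in the sketch below. The three-row data set is hypothetical, and the plotting call only runs if the forestplot package is available.

```r
library(grid)  # gpar(); grid ships with R

n_rows <- 3  # NROW(mean) for the hypothetical plot below

# Numbered form: names are line positions, where "1" is the line
# above the first row and n_rows + 1 is the line below the last row
lines_by_name <- list(
  "1" = gpar(lwd = 2),
  "3" = gpar(lty = 2, col = "gray40")
)

# Full-length form: one element per possible line (n_rows + 1 in total),
# each being TRUE (standard line), NULL (no line), or a gpar object
lines_full <- list(TRUE, NULL, NULL, gpar(lwd = 2))

stopifnot(
  all(as.integer(names(lines_by_name)) <= n_rows + 1),
  length(lines_full) == n_rows + 1
)

if (requireNamespace("forestplot", quietly = TRUE)) {
  library(forestplot)
  # Illustrative data only
  fp <- forestplot(
    labeltext = cbind(c("Study A", "Study B", "Pooled")),
    mean  = c(1.20, 0.90, 1.05),
    lower = c(0.90, 0.60, 0.90),
    upper = c(1.60, 1.30, 1.25),
    is.summary = c(FALSE, FALSE, TRUE),
    hrzl_lines = lines_by_name
  )
  print(fp)
}
```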

Known Issues

The x-axis does not completely adhere to the margin.
Autosizing boxes may not always yield the best visual result; manual adjustment is recommended where possible.

API Changes from rmeta package's forestplot

xlog: Outputs the axis in log() format, but the input data should be in antilog/exp format.
col: The corresponding function in this package is fpColors.
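
To illustrate the xlog convention, the sketch below converts log-scale estimates to the exponentiated form that the function expects as input. The log odds ratios and standard errors are invented, and a Wald-type 95% interval is assumed; the plotting call is skipped when the forestplot package is not installed.

```r
# Hypothetical log odds ratios and standard errors, e.g. as reported
# by summary() of a logistic regression fit
log_or <- c(treatment = 0.47, age = -0.22)
se     <- c(treatment = 0.15, age = 0.10)

# Wald-type 95% interval on the log scale, then exponentiate everything
z <- qnorm(0.975)
est <- data.frame(
  variable = names(log_or),
  mean  = exp(log_or),
  lower = exp(log_or - z * se),
  upper = exp(log_or + z * se)
)

# Exponentiated values are strictly positive, as a log axis requires
stopifnot(all(est$lower > 0))

if (requireNamespace("forestplot", quietly = TRUE)) {
  library(forestplot)
  fp <- forestplot(
    labeltext = cbind(est$variable),
    mean  = est$mean,
    lower = est$lower,
    upper = est$upper,
    zero = 1,     # no-effect line for ratios
    xlog = TRUE   # axis drawn on a log scale, inputs stay exponentiated
  )
  print(fp)
}
```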

Author

Max Gordon, Thomas Lumley

Details

This version of forestplot() enhances the standard function in the following ways:

Adding Expressions: Allows the use of expressions, such as expression(beta).
Multiple Bands: Enables multiple confidence bands for the same label.
Autosize: Adapts to the viewport (graph) size.
Tidyverse syntax: Utilizes convenient dplyr/tidyverse syntax for more flexible data manipulation.
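
The difference between a list and a matrix as labeltext, mentioned under the labeltext argument, can be checked in base R. This sketch reuses the label strings from the argument description and assumes each inner list represents one column.

```r
# List form: one inner list per column, one element per row.
# Elements may be strings or expressions.
labels_list <- list(
  list("rowname 1 col 1", "rowname 2 col 1"),
  list("r1c2", expression(beta))
)

# Matrix form: same layout, but every cell is a plain string,
# so "beta" is shown as text rather than as a Greek letter
labels_matrix <- matrix(
  c("rowname 1 col 1", "rowname 2 col 1", "r1c2", "beta"),
  ncol = 2
)

# Both describe a 2 x 2 label table
stopifnot(
  length(labels_list) == ncol(labels_matrix),
  all(lengths(labels_list) == nrow(labels_matrix))
)

is.expression(labels_list[[2]][[2]])  # the list keeps the expression
is.character(labels_matrix[2, 2])     # the matrix cell is just a string
```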

Examples

#############################################
# Simple examples of how to do a forestplot #
#############################################

library(forestplot) # Also attaches grid, which provides gpar(), viewport(), etc.
ask <- par(ask = TRUE)

# A basic example, create some fake data
row_names <- list(list("test = 1", expression(test >= 2)))
test_data <- data.frame(
  coef = c(1.59, 1.24),
  low = c(1.4, 0.78),
  high = c(1.8, 1.55)
)
test_data |>
  forestplot(labeltext = row_names,
             mean = coef,
             lower = low,
             upper = high,
             zero = 1,
             cex  = 2,
             lineheight = "auto",
             xlab = "Lab axis txt") |>
  fp_add_header("Group") |>
  fp_set_style(lines = gpar(col = "darkblue"))

# Print two plots side by side using the grid
# package's layout option for viewports
fp1 <- test_data |>
  forestplot(labeltext = row_names,
             mean = coef,
             lower = low,
             upper = high,
             zero = 1,
             cex  = 2,
             lineheight = "auto",
             title = "Plot 1",
             xlab = "Lab axis txt")
fp2 <- test_data |>
  forestplot(labeltext = row_names,
             mean = coef,
             lower = low,
             upper = high,
             zero = 1,
             cex  = 2,
             lineheight = "auto",
             xlab = "Lab axis txt",
             title = "Plot 2",
             new_page = FALSE)

grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
pushViewport(viewport(layout.pos.col = 1))
plot(fp1)
popViewport()
pushViewport(viewport(layout.pos.col = 2))
plot(fp2)
popViewport(2)

# An advanced example
library(dplyr)
library(tidyr)
test_data <- data.frame(id = 1:4,
                        coef1 = c(1, 1.59, 1.3, 1.24),
                        coef2 = c(1, 1.7, 1.4, 1.04),
                        low1 = c(1, 1.3, 1.1, 0.99),
                        low2 = c(1, 1.6, 1.2, 0.7),
                        high1 = c(1, 1.94, 1.6, 1.55),
                        high2 = c(1, 1.8, 1.55, 1.33))

# Convert into dplyr formatted data
out_data <- test_data |>
  pivot_longer(cols = everything() & -id) |>
  mutate(group = gsub("(.+)([12])$", "\\2", name),
         name = gsub("(.+)([12])$", "\\1", name)) |>
  pivot_wider() |>
  group_by(id) |>
  mutate(col1 = lapply(id, \(x) ifelse(x < 4,
                                       paste("Category", id),
                                       expression(Category >= 4))),
         col2 = lapply(1:n(), \(i) substitute(expression(bar(x) == val),
                                              list(val = mean(coef) |> round(2)))),
         col2 = if_else(id == 1,
                        rep("ref", n()) |> as.list(),
                        col2)) |>
  group_by(group)

out_data |>
  forestplot(mean = coef,
             lower = low,
             upper = high,
             labeltext = c(col1, col2),
             title = "Cool study",
             zero = c(0.98, 1.02),
             grid = structure(c(2^-.5, 2^.5),
                              gp = gpar(col = "steelblue", lty = 2)
             ),
             boxsize = 0.25,
             xlab = "The estimates",
             new_page = TRUE,
             legend = c("Treatment", "Placebo"),
             legend_args = fpLegend(
               pos = list("topright"),
               title = "Group",
               r = unit(.1, "snpc"),
               gp = gpar(col = "#CCCCCC", lwd = 1.5)
             )) |>
  fp_set_style(box = c("royalblue", "gold"),
               line = c("darkblue", "orange"),
               summary = c("darkblue", "red"))

# An example of how the exponential works
data.frame(coef = c(2.45, 0.43),
           low = c(1.5, 0.25),
           high = c(4, 0.75),
           boxsize = c(0.25, 0.25),
           variables = c("Variable A", "Variable B")) |>
  forestplot(labeltext = c(variables, coef),
             mean = coef,
             lower = low,
             upper = high,
             boxsize = boxsize,
             zero = 1,
             xlog = TRUE) |>
  fp_set_style(lines = "red", box = "darkred") |>
  fp_add_header(coef = "HR" |> fp_txt_plain() |> fp_align_center(),
                variables = "Measurements")

# An example using style
forestplot(labeltext = cbind(Author = c("Smith et al", "Smooth et al", "Al et al")),
           mean = cbind(1:3, 1.5:3.5),
           lower = cbind(0:2, 0.5:2.5),
           upper = cbind(4:6, 5.5:7.5),
           is.summary = c(FALSE, FALSE, TRUE),
           vertices = TRUE) |>
  fp_set_style(default = gpar(lineend = "square", linejoin = "mitre", lwd = 3, col = "pink"),
               box = gpar(fill = "black", col = "red"), # only one parameter
               lines = list( # as many parameters as CI
                 gpar(lwd = 10), gpar(lwd = 5),
                 gpar(), gpar(),
                 gpar(lwd = 2), gpar(lwd = 1)
               ),
               summary = list( # as many parameters as band per label
                 gpar(fill = "violet", col = "gray", lwd = 10),
                 gpar(fill = "orange", col = "gray", lwd = 10)
               ))

par(ask = ask)
# See vignette for a more detailed description
# vignette("forestplot",  package="forestplot")
