This function generates a forest plot with extended capabilities compared to the default forestplot() function in the rmeta package. It overcomes some limitations of the original function by supporting expressions, multiple confidence bands per label, autosizing to the viewport, and modern tidyverse syntax. Refer to vignette("forestplot") for comprehensive details.
forestplot(...)
# S3 method for data.frame
forestplot(x, mean, lower, upper, labeltext, is.summary, boxsize, ...)
# S3 method for default
forestplot(
  labeltext,
  mean,
  lower,
  upper,
  align = NULL,
  is.summary = FALSE,
  graph.pos = "right",
  hrzl_lines = NULL,
  clip = c(-Inf, Inf),
  xlab = NULL,
  zero = ifelse(xlog, 1, 0),
  graphwidth = "auto",
  colgap = NULL,
  lineheight = "auto",
  line.margin = NULL,
  col = fpColors(),
  txt_gp = fpTxtGp(),
  xlog = FALSE,
  xticks = NULL,
  xticks.digits = 2,
  grid = FALSE,
  lwd.xaxis = NULL,
  lwd.zero = 1,
  lwd.ci = NULL,
  lty.ci = 1,
  ci.vertices = NULL,
  ci.vertices.height = 0.1,
  boxsize = NULL,
  mar = unit(rep(5, times = 4), "mm"),
  title = NULL,
  legend = NULL,
  legend_args = fpLegend(),
  new_page = getOption("forestplot_new_page", TRUE),
  fn.ci_norm = fpDrawNormalCI,
  fn.ci_sum = fpDrawSummaryCI,
  fn.legend = NULL,
  shapes_gp = fpShapesGp(),
  ...
)
# S3 method for gforge_forestplot
print(x, ...)
# S3 method for gforge_forestplot
plot(x, y, ..., new_page = FALSE)
# S3 method for grouped_df
forestplot(x, labeltext, mean, lower, upper, legend, is.summary, boxsize, ...)
Returns a gforge_forestplot object.
...: Passed on to the fn.ci_norm and fn.ci_sum arguments.
x: The gforge_forestplot object to be printed.
mean: The name of the column if using the dplyr select syntax (defaults to "mean"); otherwise a vector or a matrix with the averages. You can also provide a 2D/3D matrix that is automatically converted to the lower/upper parameters. The values should be in exponentiated form if they follow this interpretation, e.g. use exp(mean) if you have the output from a logistic regression.
lower: The lower bound of the confidence interval for the forestplot; it needs to be in the same format as mean.
upper: The upper bound of the confidence interval for the forestplot; it needs to be in the same format as mean.
labeltext: A list, matrix, vector or expression with the names of each row, or the name of the column if using the dplyr select syntax (defaults to "labeltext"). Note that when using group_by a separate labeltext is not allowed. The list should be nested so that it resembles an m x n matrix, with one inner list per column: list(list("rowname 1 col 1", "rowname 2 col 1"), list("r1c2", expression(beta))). You can also provide a matrix, although this cannot have expressions by design: matrix(c("rowname 1 col 1", "rowname 2 col 1", "r1c2", "beta"), ncol = 2). Use NA for blank spaces; if you provide a full column of NA, that column becomes an empty column that adds some space. Note: If you do not provide the mean/lower/upper arguments, the function expects the label text to be a matrix containing the labeltext in the rownames and then columns for mean, lower, and upper.
is.summary: A vector indicating by TRUE/FALSE if the value is a summary value, which means that it will have a different font style.
boxsize: Override the default box size based on precision.
align: Vector giving alignment (l, r, c) for the table columns.
graph.pos: The position of the graph element within the table of text. The position can be any integer from 1 to ncol(labeltext) + 1. You can also set the position to "left" or "right".
hrzl_lines: Add horizontal lines to the graph. Can either be TRUE or a list of gpar. See the line section below for details.
clip: Lower and upper limits for clipping confidence intervals to arrows.
xlab: x-axis label.
zero: x-axis coordinate for the zero line. If you provide a vector of length 2 it will print a rectangle instead of just a line. If you provide NA the line is suppressed.
graphwidth: Width of the confidence interval graph; see unit for details on how to utilize mm etc. The default is auto, that is, it uses whatever space is left after adjusting for text size and legend.
colgap: Sets the gap between columns, defaults to 6 mm but for relative widths. Note that the value should be in unit(,"npc").
lineheight: Height of the graph. By default this is auto, which adjusts to the space that is left after adjusting for x-axis size and legend. Sometimes it is desirable to fix the line height, for instance to standardize it across several forestplots; in that case provide it as a unit object. A good option is unit(2, "cm"). A third option is to set the line height to "lines", which gives a line height 50% larger than the text height.
line.margin: Set the margin between rows, provided in numeric or unit form. When there are multiple confidence lines per row, setting the correct margin helps to visually separate rows.
col: Set the colors for all the elements. See fpColors for details.
txt_gp: Set the fonts etc. for all text elements. See fpTxtGp for details.
xlog: If TRUE, x-axis tick marks follow a logarithmic scale, e.g. for logistic regression (OR), survival estimates (HR), Poisson regression etc. Note: This is an intentional break with the original forestplot function, as I've found that exponentiated ticks/clips/zero effect are more difficult for non-statisticians and there are sometimes issues with rounding the tick marks properly.
xticks: Optional user-specified x-axis tick marks. Specify NULL to use the defaults, numeric(0) to omit the x-axis. By adding a labels attribute, attr(my_ticks, "labels") <- ..., you can dictate the text printed at each tick. If you specify a boolean vector then ticks indicated with FALSE won't be printed. Note that the labels have to be the same length as the main variable.
xticks.digits: The number of digits to allow in the x-axis if this is created by default.
grid: If you want a discrete gray dashed grid at the level of the ticks you can set this parameter to TRUE. If you set the parameter to a vector of values, lines will be drawn at the corresponding positions. If you want to specify the gpar of the lines then either directly pass a gpar object or set the gp attribute, e.g. attr(line_vector, "gp") <- gpar(lty = 2, col = "red").
lwd.xaxis: lwd for the x-axis, see gpar.
lwd.zero: lwd for the vertical line that gives the no-effect line, see gpar.
lwd.ci: lwd for the confidence bands, see gpar.
lty.ci: lty for the confidence bands, see gpar.
ci.vertices: Set this to TRUE if you want the ends of the confidence intervals to be shaped as a T. This defaults to TRUE if you have any line type other than 1, since there is a risk of a dash occurring at the very end, i.e. showing an incorrectly narrow confidence interval.
ci.vertices.height: The height of the vertices. Defaults to npc units corresponding to 10% of the row height. Note that the arrows correspond to the vertices' heights.
mar: A numerical vector of the form c(bottom, left, top, right) of the type unit.
title: The title of the plot, if any.
legend: Legend corresponding to the number of bars.
legend_args: The legend arguments as returned by the fpLegend function.
new_page: If you want the plot to appear on a new blank page then set this to TRUE; by default it is TRUE. If you want to change this behavior for all plots then set options(forestplot_new_page = FALSE).
fn.ci_norm: You can specify exactly how the line with the box is drawn for the normal (i.e. non-summary) confidence interval by changing this parameter to your own function or some of the alternatives provided in the package. It defaults to the box function fpDrawNormalCI.
fn.ci_sum: Same as the previous argument but for the summary outputs; it defaults to fpDrawSummaryCI.
fn.legend: What type of function should be used for drawing the legends; this can be a list if you want different functions. It defaults to a box if you have anything else than a single function or the number of columns in the mean argument.
shapes_gp: Sets graphical parameters (squares and lines widths, styles, etc.) of all shapes drawn (squares, lines, diamonds, etc.). This overrides col, lwd.xaxis, lwd.zero, lwd.ci and lty.ci.
y: Ignored
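As a rough illustration of the xticks, grid and lineheight arguments described above, the following sketch uses made-up estimates. The data frame est, its column names and the tick positions are hypothetical, and the calls simply follow the argument descriptions; treat it as a starting point rather than a reference implementation.

```r
library(forestplot)
library(grid) # gpar() and unit()

# Hypothetical estimates, invented for illustration only
est <- data.frame(
  study = c("Study A", "Study B"),
  coef  = c(1.59, 1.24),
  low   = c(1.40, 0.78),
  high  = c(1.80, 1.55)
)

# Custom tick positions with explicit labels (one label per tick)
my_ticks <- c(0.5, 1, 1.5, 2)
attr(my_ticks, "labels") <- c("0.5", "1.0", "1.5", "2.0")

# Grid lines at chosen positions, styled through the gp attribute
grid_lines <- structure(c(0.75, 1.25), gp = gpar(lty = 2, col = "red"))

p <- est |>
  forestplot(labeltext = study,
             mean = coef,
             lower = low,
             upper = high,
             zero = 1,
             xticks = my_ticks,
             grid = grid_lines,
             lineheight = unit(2, "cm"))

# The object is drawn when printed
print(p)
```

Fixing lineheight to a unit, as here, is mainly useful when several plots should share the same row height.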
Multiple bands (or lines) per variable can be useful for comparing different outcomes. For instance, you may want to compare heart disease-specific survival to overall survival rates for smokers. It can be insightful to overlay two bands for this purpose. Another application could be displaying crude and adjusted estimates as separate bands.
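A minimal sketch of the crude-versus-adjusted case, assuming the default method is given matrices with one column per band, as in the style example further down this page. All numbers and labels are invented for illustration.

```r
library(forestplot)

# Hypothetical hazard ratios: column 1 = crude, column 2 = adjusted
hr    <- cbind(crude = c(1.90, 1.40), adjusted = c(1.60, 1.25))
lower <- cbind(crude = c(1.50, 1.10), adjusted = c(1.20, 0.95))
upper <- cbind(crude = c(2.40, 1.80), adjusted = c(2.10, 1.65))

p <- forestplot(labeltext = cbind(Exposure = c("Smoker", "Ex-smoker")),
                mean = hr,
                lower = lower,
                upper = upper,
                zero = 1,
                boxsize = 0.2,
                legend = c("Crude", "Adjusted")) |>
  fp_set_style(box = c("royalblue", "gold"),
               lines = c("darkblue", "orange"))

print(p)
```

Each row then shows two bands, one per matrix column, and the legend entries follow the column order.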
The hrzl_lines argument can be set as TRUE or a list with grid::gpar elements.
TRUE: A line will be added based upon the is.summary rows. If the first line is a summary it
grid::gpar: The same as above but the lines will be formatted according to the grid::gpar element
: The list must either be numbered, i.e. list("2" = gpar(lty = 1))
, or have the same length
as the NROW(mean) + 1
. If the list is numbered the numbers should not exceed the NROW(mean) + 1
.
The no. 1 row designates the top, i.e. the line above the first row, all other correspond to
the row below. Each element in the list needs to be TRUE
, NULL
, or
gpar
element. The TRUE
defaults to a standard line, the NULL
skips a line, while gpar
corresponds to the fully customized line. Apart from
allowing standard gpar
line descriptions, lty
, lwd
, col
, and more
you can also specify gpar(columns = c(1:3, 5))
if you for instance want the line to skip a column.
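The numbered-list form can be sketched as follows. The data are hypothetical: a header row, two studies and a summary row, so NROW(mean) + 1 is 5 and valid line numbers run from 1 to 5.

```r
library(forestplot)
library(grid) # gpar()

# Hypothetical table: header, two studies, summary
labels <- cbind(c("Study", "A", "B", "Summary"),
                c("Estimate", "1.20", "0.80", "1.00"))

p <- forestplot(labeltext = labels,
                mean  = c(NA, 1.20, 0.80, 1.00),
                lower = c(NA, 0.90, 0.60, 0.85),
                upper = c(NA, 1.60, 1.10, 1.20),
                is.summary = c(TRUE, FALSE, FALSE, TRUE),
                zero = 1,
                hrzl_lines = list(
                  # line 2 sits below row 1, i.e. under the header
                  "2" = gpar(lty = 1),
                  # line 4 sits below row 3; it spans only the two text columns
                  "4" = gpar(lwd = 2, col = "#444444", columns = 1:2)
                ))

print(p)
```

Passing hrzl_lines = TRUE instead would leave the placement to the is.summary rows.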
The x-axis does not completely adhere to the margin.
Autosizing boxes may not always yield the best visual result; manual adjustment is recommended where possible.
xlog: Outputs the axis in log() format, but the input data should be in antilog/exp format.
col: The corresponding function in this package is fpColors.
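To make the xlog convention concrete, here is a small sketch that starts from estimates on the log scale and exponentiates them before plotting. The log odds ratios and standard errors are made up, and the Wald interval is just one common way to obtain the bounds.

```r
library(forestplot)

# Hypothetical log odds ratios and standard errors
log_or <- c(0.90, -0.85)
se     <- c(0.25, 0.28)
z      <- qnorm(0.975)

# forestplot() expects the exponentiated values even when xlog = TRUE
or_data <- data.frame(
  variables = c("Variable A", "Variable B"),
  coef = exp(log_or),
  low  = exp(log_or - z * se),
  high = exp(log_or + z * se)
)

p <- or_data |>
  forestplot(labeltext = variables,
             mean = coef,
             lower = low,
             upper = high,
             zero = 1,   # no-effect line on the odds ratio scale
             xlog = TRUE)

print(p)
```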
Max Gordon, Thomas Lumley
This version of forestplot() enhances the standard function in the following ways:
Adding Expressions: Allows the use of expressions, such as expression(beta).
Multiple Bands: Enables multiple confidence bands for the same label.
Autosize: Adapts to the viewport (graph) size.
Tidyverse syntax: Utilizes convenient dplyr/tidyverse syntax for more flexible data manipulation.
vignette("forestplot")
Other forestplot functions: fpColors(), fpDrawNormalCI(), fpLegend(), fpShapesGp(), fp_add_lines(), fp_decorate_graph(), fp_insert_row(), fp_set_style(), fp_set_zebra_style()
#############################################
# Simple examples of how to do a forestplot #
#############################################
library(forestplot)
library(grid) # gpar(), unit() and the viewport helpers used below

ask <- par(ask = TRUE)

# A basic example, create some fake data
row_names <- list(list("test = 1", expression(test >= 2)))
test_data <- data.frame(
  coef = c(1.59, 1.24),
  low = c(1.4, 0.78),
  high = c(1.8, 1.55)
)
test_data |>
  forestplot(labeltext = row_names,
             mean = coef,
             lower = low,
             upper = high,
             zero = 1,
             cex = 2,
             lineheight = "auto",
             xlab = "Lab axis txt") |>
  fp_add_header("Group") |>
  fp_set_style(lines = gpar(col = "darkblue"))

# Print two plots side by side using the grid
# package's layout option for viewports
fp1 <- test_data |>
  forestplot(labeltext = row_names,
             mean = coef,
             lower = low,
             upper = high,
             zero = 1,
             cex = 2,
             lineheight = "auto",
             title = "Plot 1",
             xlab = "Lab axis txt")
fp2 <- test_data |>
  forestplot(labeltext = row_names,
             mean = coef,
             lower = low,
             upper = high,
             zero = 1,
             cex = 2,
             lineheight = "auto",
             xlab = "Lab axis txt",
             title = "Plot 2",
             new_page = FALSE)
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
pushViewport(viewport(layout.pos.col = 1))
plot(fp1)
popViewport()
pushViewport(viewport(layout.pos.col = 2))
plot(fp2)
popViewport(2)

# An advanced example
library(dplyr)
library(tidyr)
test_data <- data.frame(id = 1:4,
                        coef1 = c(1, 1.59, 1.3, 1.24),
                        coef2 = c(1, 1.7, 1.4, 1.04),
                        low1 = c(1, 1.3, 1.1, 0.99),
                        low2 = c(1, 1.6, 1.2, 0.7),
                        high1 = c(1, 1.94, 1.6, 1.55),
                        high2 = c(1, 1.8, 1.55, 1.33))
# Convert into dplyr formatted data
out_data <- test_data |>
  pivot_longer(cols = everything() & -id) |>
  mutate(group = gsub("(.+)([12])$", "\\2", name),
         name = gsub("(.+)([12])$", "\\1", name)) |>
  pivot_wider() |>
  group_by(id) |>
  mutate(col1 = lapply(id, \(x) ifelse(x < 4,
                                       paste("Category", id),
                                       expression(Category >= 4))),
         col2 = lapply(1:n(), \(i) substitute(expression(bar(x) == val),
                                              list(val = mean(coef) |> round(2)))),
         col2 = if_else(id == 1,
                        rep("ref", n()) |> as.list(),
                        col2)) |>
  group_by(group)
out_data |>
  forestplot(mean = coef,
             lower = low,
             upper = high,
             labeltext = c(col1, col2),
             title = "Cool study",
             zero = c(0.98, 1.02),
             grid = structure(c(2^-.5, 2^.5),
                              gp = gpar(col = "steelblue", lty = 2)
             ),
             boxsize = 0.25,
             xlab = "The estimates",
             new_page = TRUE,
             legend = c("Treatment", "Placebo"),
             legend_args = fpLegend(
               pos = list("topright"),
               title = "Group",
               r = unit(.1, "snpc"),
               gp = gpar(col = "#CCCCCC", lwd = 1.5)
             )) |>
  fp_set_style(box = c("royalblue", "gold"),
               line = c("darkblue", "orange"),
               summary = c("darkblue", "red"))

# An example of how the exponential works
data.frame(coef = c(2.45, 0.43),
           low = c(1.5, 0.25),
           high = c(4, 0.75),
           boxsize = c(0.25, 0.25),
           variables = c("Variable A", "Variable B")) |>
  forestplot(labeltext = c(variables, coef),
             mean = coef,
             lower = low,
             upper = high,
             boxsize = boxsize,
             zero = 1,
             xlog = TRUE) |>
  fp_set_style(lines = "red", box = "darkred") |>
  fp_add_header(coef = "HR" |> fp_txt_plain() |> fp_align_center(),
                variables = "Measurements")

# An example using style
forestplot(labeltext = cbind(Author = c("Smith et al", "Smooth et al", "Al et al")),
           mean = cbind(1:3, 1.5:3.5),
           lower = cbind(0:2, 0.5:2.5),
           upper = cbind(4:6, 5.5:7.5),
           is.summary = c(FALSE, FALSE, TRUE),
           vertices = TRUE) |>
  fp_set_style(default = gpar(lineend = "square", linejoin = "mitre", lwd = 3, col = "pink"),
               box = gpar(fill = "black", col = "red"), # only one parameter
               lines = list( # as many parameters as CI
                 gpar(lwd = 10), gpar(lwd = 5),
                 gpar(), gpar(),
                 gpar(lwd = 2), gpar(lwd = 1)
               ),
               summary = list( # as many parameters as band per label
                 gpar(fill = "violet", col = "gray", lwd = 10),
                 gpar(fill = "orange", col = "gray", lwd = 10)
               ))

par(ask = ask)
# See vignette for a more detailed description
# vignette("forestplot", package="forestplot")