ggpmisc

Purpose

Package ‘ggpmisc’ (Miscellaneous Extensions to ‘ggplot2’) is a set of extensions to R package ‘ggplot2’ (>= 3.0.0) with emphasis on annotations and highlighting related to fitted models and data summaries. Data summaries shown as text, tables or equations are implemented. New geoms support insets in ggplots. The grammar of graphics is extended to support normalised parent coordinates (npc) so that annotations can be easily positioned using special geometries and scales. New position functions facilitate the labeling of observations by nudging data labels away from or towards curves or a focal virtual center.

Extended Grammar of Graphics

The position of annotations within the plotting area depends in most cases on graphic design considerations rather than on properties such as the range of values in the data being plotted. In particular, the location within the plotting area of large annotation objects like model-fit summaries, location maps, plots, and tables usually needs to be set independently of the x and y scales, re-scaling or any transformations. To acknowledge this, the Grammar of Graphics is here expanded by supporting x and y positions expressed in ‘grid’ “npc” units in the range 0..1. This is implemented with new (pseudo-)aesthetics npcx and npcy and their corresponding scales, plus geometries and a revised annotate() function. The new aesthetics function in “parallel” with the x and y aesthetics used for plotting data. The advantage of this approach is that the syntax used for annotations becomes identical to that used for plotting data, and that annotations made with this approach cleanly support facets in a way consistent with the rest of the grammar.

Aesthetics and scales

Scales scale_npcx_continuous() and scale_npcy_continuous() and the corresponding new aesthetics npcx and npcy make it possible to add graphic elements and text to plots using coordinates expressed in npc units for the location within the plotting area.
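
As a minimal sketch of positioning in npc units, the code below places a text annotation near the top-right corner of the plotting area, whatever the ranges of the x and y scales. The data set (mtcars), the label text and the object name p_npc are chosen here only for illustration.

```r
library(ggpmisc)

# Positions are given in npc units (0..1) through the npcx and npcy aesthetics,
# so the annotation stays in the same corner even if the data limits change.
p_npc <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  annotate("text_npc", npcx = 0.95, npcy = 0.95,
           label = paste("n =", nrow(mtcars)))
```

Printing p_npc draws the plot.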

Scales scale_x_logFC() and scale_y_logFC() are suitable for plotting of log fold change data. Scales scale_x_Pvalue(), scale_y_Pvalue(), scale_x_FDR() and scale_y_FDR() are suitable for plotting p-values and adjusted p-values or false discovery rate (FDR). Default arguments are suitable for volcano and quadrant plots as used for transcriptomics, metabolomics and similar data.

Scales scale_colour_outcome(), scale_fill_outcome() and scale_shape_outcome() and functions outcome2factor(), threshold2factor(), xy_outcomes2factor() and xy_thresholds2factor() used together make it easy to map ternary numeric outcomes and logical binary outcomes to color, fill and shape aesthetics. Default arguments are suitable for volcano, quadrant and other plots as used for genomics, metabolomics and similar data.
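
The scales and conversion functions described in the two paragraphs above can be combined roughly as follows. This is a sketch under two assumptions: that ternary outcomes are coded as -1, 0 and +1, and that the example data set volcano_example.df has columns named logFC, PValue and outcome. The object names outcome_fct and p_volcano are illustrative.

```r
library(ggpmisc)

# Assumed coding of ternary outcomes: -1 (down), 0 (uncertain), +1 (up)
outcome_codes <- c(-1, 0, 1, 1)
outcome_fct <- outcome2factor(outcome_codes)

# Volcano plot sketch; the column names logFC, PValue and outcome
# are assumed to be present in volcano_example.df
p_volcano <- ggplot(volcano_example.df,
                    aes(logFC, PValue, colour = outcome2factor(outcome))) +
  geom_point() +
  scale_x_logFC(name = "Transcript abundance%unit") +
  scale_y_Pvalue() +
  scale_colour_outcome()
```

Printing p_volcano draws the plot; the default arguments of the three scales are left unchanged here.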

Geometries

Geometries geom_table(), geom_plot() and geom_grob() make it possible to add inset tables, inset plots, and arbitrary ‘grid’ graphical objects including bitmaps and vector graphics as layers to a ggplot using native coordinates for x and y.
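
A minimal sketch of an inset table follows. It assumes that geom_table() takes the table from a list column mapped to the label aesthetic; the summary shown (medians by number of cylinders), the x and y position and the object names are chosen only for illustration.

```r
library(ggpmisc)

# Summary table to display as an inset: medians of mpg and hp by cyl
tb <- aggregate(cbind(mpg, hp) ~ cyl, data = mtcars, FUN = median)

# One-row tibble: x and y give the position in data coordinates,
# and the list column holds the data frame to be drawn
data.tb <- tibble::tibble(x = 5.45, y = 34, tb = list(tb))

p_table <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point() +
  geom_table(data = data.tb, aes(x, y, label = tb))
```

Printing p_table draws the scatter plot with the table as an additional layer.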

Geometries geom_text_npc(), geom_label_npc(), geom_table_npc(), geom_plot_npc() and geom_grob_npc() are versions of geometries that accept positions on the x and y axes through aesthetics npcx and npcy, with values expressed in “npc” units.

Geometries geom_x_margin_arrow(), geom_y_margin_arrow(), geom_x_margin_grob(), geom_y_margin_grob(), geom_x_margin_point() and geom_y_margin_point() make it possible to add marks along the x and y axes. geom_vhlines() and geom_quadrant_lines() draw vertical and horizontal reference lines within a single layer.
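
The sketch below combines reference lines with marks on the margins. It assumes that these geometries take their positions through xintercept and yintercept parameters; the intercept values and the object name p_marks are arbitrary.

```r
library(ggpmisc)

# Vertical and horizontal reference lines drawn in a single layer, plus
# an arrow on the x margin and a point on the y margin at the same values.
p_marks <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_vhlines(xintercept = 3.5, yintercept = 20, linetype = "dashed") +
  geom_x_margin_arrow(xintercept = 3.5) +
  geom_y_margin_point(yintercept = 20)
```

Printing p_marks draws the plot.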

Geometry geom_linked_text() connects text drawn at a nudged position to the original position, usually that of a point being labelled.

Statistics

Statistic stat_fmt_tb() helps with the formatting of tables to be plotted with geom_table().

Statistics stat_peaks() and stat_valleys() can be used to highlight and/or label maxima and minima in a plot.

Statistics that help with reporting the results of model fits are stat_poly_eq(), stat_fit_residuals(), stat_fit_deviations(), stat_fit_glance(), stat_fit_augment(), stat_fit_tidy() and stat_fit_tb().

Four statistics, stat_dens2d_filter(), stat_dens2d_labels(), stat_dens1d_filter() and stat_dens1d_labels(), implement tagging or selective labeling of observations based on the local 2D or 1D density of observations in a panel. Another two statistics, stat_dens2d_filter_g() and stat_dens1d_filter_g(), compute the density by group instead of by plot panel. These six stats are designed to work well together with geom_text_repel() and geom_label_repel() from package ‘ggrepel’.

A summary statistic using special grouping for quadrants stat_quadrant_counts() can be used to automate labeling with the number of observations.

The statistics stat_apply_panel() and stat_apply_group() can be useful for applying arbitrary functions returning numeric vectors. They are especially useful with functions like cumsum(), cummax() and diff().
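
As a sketch of this use, the code below plots the running sum of simulated increments, computed separately for each group. The data are made up, and the parameter names .fun.x and .fun.y are assumed to be the ones through which the functions are passed.

```r
library(ggpmisc)

# Simulated increments for two groups (illustrative data only)
set.seed(4321)
walk.df <- data.frame(step = rep(1:20, 2),
                      increment = rnorm(40),
                      walker = rep(c("A", "B"), each = 20))

# x values are passed through unchanged; cumsum() is applied to y within each group
p_cumsum <- ggplot(walk.df, aes(step, increment, colour = walker)) +
  stat_apply_group(.fun.x = function(x) {x}, .fun.y = cumsum)
```

Printing p_cumsum draws one cumulative line per group.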

Position functions

New position functions implementing different flavours of nudging are provided: position_nudge_keep(), position_nudge_to(), position_nudge_center() and position_nudge_line(). These last two functions make it possible to apply nudging that varies automatically according to the relative position of points with respect to arbitrary points or lines, or with respect to a polynomial or smoothing spline fitted on-the-fly to the observations. In contrast to ggplot2::position_nudge() all these functions return the repositioned and original x and y coordinates.
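
A minimal sketch of nudging away from a focal point follows. The data are made up, the centre is set explicitly to the centroid of the points, and the argument value direction = "radial" is an assumption about how radial nudging is requested.

```r
library(ggpmisc)

# Illustrative data: six labelled points
labelled.df <- data.frame(x = c(1, 3, 2, 5, 4, 2.5),
                          y = c(2, 3, 2.5, 1.8, 2.1, 2.2),
                          label = letters[1:6])

# Labels are displaced away from the centroid of the points,
# so the direction of nudging differs from point to point.
p_nudge <- ggplot(labelled.df, aes(x, y)) +
  geom_point() +
  geom_text(aes(label = label),
            position = position_nudge_center(x = 0.1, y = 0.15,
                                             center_x = mean(labelled.df$x),
                                             center_y = mean(labelled.df$y),
                                             direction = "radial"))
```

Printing p_nudge draws the points with their nudged labels.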

ggplot methods

Because ggplot() is defined as a generic in ‘ggplot2’, it is possible to define specializations. We provide two, for time series stored in objects of classes ts and xts, which automatically convert these objects into tibbles and set default aesthetic mappings for x and y. A companion function try_tibble() is also exported.
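
The conversion can also be done explicitly, as in the sketch below, which uses the lynx time series from base R. The object names lynx.tb and p_lynx are illustrative.

```r
library(ggpmisc)

# Explicit conversion of a ts object into a tibble
lynx.tb <- try_tibble(lynx)
head(lynx.tb)

# The ggplot() specialization for ts objects converts the series
# and sets the x and y mappings by default
p_lynx <- ggplot(lynx) + geom_line()
```

Calling try_tibble() directly is handy when the converted data need further processing before plotting.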

MIGRATED

Functions for the manipulation of layers in ggplot objects, together with statistics and geometries useful for debugging extensions to package ‘ggplot2’, earlier included in this package are now in package ‘gginnards’.

Examples

library(ggpmisc)
library(ggrepel)

In the first example we plot a time series using the specialized version of ggplot() that converts the time series into a tibble and maps the x and y aesthetics automatically. We also highlight and label the peaks using stat_peaks().

ggplot(lynx, as.numeric = FALSE) + geom_line() + 
  stat_peaks(colour = "red") +
  stat_peaks(geom = "text", colour = "red", angle = 66,
             hjust = -0.1, x.label.fmt = "%Y") +
  stat_peaks(geom = "rug", colour = "red", sides = "b") +
  expand_limits(y = 8000)

In the second example we add the equation for a fitted polynomial plus the adjusted coefficient of determination to a plot showing the observations plus the fitted curve, deviations and confidence band. We use stat_poly_eq().

formula <- y ~ x + I(x^2)
ggplot(cars, aes(speed, dist)) +
  geom_point() +
  stat_fit_deviations(method = "lm", formula = formula, colour = "red") +
  geom_smooth(method = "lm", formula = formula) +
  stat_poly_eq(aes(label =  paste(stat(eq.label), stat(adj.rr.label), sep = "*\", \"*")),
               formula = formula, parse = TRUE)

The third example shows the same figure as the second, but this time annotated with the ANOVA table for the model fit. We use stat_fit_tb(), which can add ANOVA or summary tables.

formula <- y ~ x + I(x^2)
ggplot(cars, aes(speed, dist)) +
  geom_point() +
  geom_smooth(method = "lm", formula = formula) +
  stat_fit_tb(method = "lm",
              method.args = list(formula = formula),
              tb.type = "fit.anova",
              tb.vars = c(Effect = "term", 
                          "df",
                          "M.S." = "meansq", 
                          "italic(F)" = "statistic", 
                          "italic(P)" = "p.value"),
              tb.params = c(x = 1, "x^2" = 2),
              label.y.npc = "top", label.x.npc = "left",
              size = 2.5,
              parse = TRUE)
#> Warning: Computation failed in `stat_fit_tb()`:
#> no applicable method for 'tidy' applied to an object of class "c('anova', 'data.frame')"

A plot with an inset plot.

p <- ggplot(mtcars, aes(factor(cyl), mpg, colour = factor(cyl))) +
  stat_boxplot() +
  labs(y = NULL) +
  theme_bw(9) + theme(legend.position = "none")

ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point() +
  annotate("plot_npc", npcx = "left", npcy = "bottom", label = p) +
  expand_limits(y = 0, x = 0)

A quadrant plot with counts and labels, using geom_text_repel() from package ‘ggrepel’.

ggplot(quadrant_example.df, aes(logFC.x, logFC.y)) +
  geom_point(alpha = 0.3) +
  geom_quadrant_lines() +
  stat_quadrant_counts() +
  stat_dens2d_filter(color = "red", keep.fraction = 0.02) +
  stat_dens2d_labels(aes(label = gene), keep.fraction = 0.02, 
                     geom = "text_repel", size = 2, colour = "red") +
  scale_x_logFC(name = "Transcript abundance after A%unit") +
  scale_y_logFC(name = "Transcript abundance after B%unit")

Installation

Installation of the most recent stable version from CRAN:

install.packages("ggpmisc")

Installation of the current unstable version from GitHub:

# install.packages("devtools")
devtools::install_github("aphalo/ggpmisc")

Documentation

HTML documentation is available at (https://docs.r4photobiology.info/ggpmisc/), including a User Guide.

News about updates is regularly posted at (https://www.r4photobiology.info/).

Contributing

Please report bugs and request new features at (https://github.com/aphalo/ggpmisc/issues). Pull requests are welcome at (https://github.com/aphalo/ggpmisc).

Citation

If you use this package to produce scientific or commercial publications, please cite according to:

citation("ggpmisc")
#> 
#> To cite package 'ggpmisc' in publications use:
#> 
#>   Pedro J. Aphalo (2021). ggpmisc: Miscellaneous Extensions to
#>   'ggplot2'. https://docs.r4photobiology.info/ggpmisc/,
#>   https://github.com/aphalo/ggpmisc.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {ggpmisc: Miscellaneous Extensions to 'ggplot2'},
#>     author = {Pedro J. Aphalo},
#>     year = {2021},
#>     note = {https://docs.r4photobiology.info/ggpmisc/,
#> https://github.com/aphalo/ggpmisc},
#>   }

License

© 2016-2021 Pedro J. Aphalo (pedro.aphalo@helsinki.fi). Released under the GPL, version 2 or greater. This software carries no warranty of any kind.

Monthly Downloads: 15,464
Version: 0.3.9
License: GPL (>= 2)
Maintainer: Pedro Aphalo
Last Published: April 4th, 2021

Functions in ggpmisc (0.3.9)

geom_quadrant_lines - Reference lines: horizontal plus vertical, and quadrants
FC_name - Fold change axis labels
annotate - Annotations supporting NPC
geom_table - Inset tables
geom_linked_text - Linked Text
geom_grob - Inset graphical objects
geom_plot - Inset plots
FC_format - Formatter for fold change tick labels
find_peaks - Find local maxima or global maximum (peaks)
compute_npcx - Compute npc coordinates
ggpmisc-package - ggpmisc: Miscellaneous Extensions to 'ggplot2'
outcome2factor - Convert numeric ternary outcomes into a factor
ggplot - Create a new ggplot plot from time series data
scale_y_Pvalue - Convenience scale for P-values
stat_apply_group - Apply a function to x or y values
position_nudge_center - Nudge labels away from a central point
geom_label_npc - Text with Normalised Parent Coordinates
scale_continuous_npc - Position scales for continuous data (npcx & npcy)
geom_x_margin_arrow - Reference arrows on the margins
scale_colour_outcome - Colour and fill scales for ternary outcomes
geom_x_margin_point - Reference points on the margins
geom_x_margin_grob - Add Grobs on the margins
position_nudge_to - Nudge labels to new positions
stat_fit_tidy - One row data frame with fitted parameter estimates
grob_draw_panel_fun - Stat* Objects
stat_fit_tb - Model-fit summary or ANOVA
position_nudge_line - Nudge labels away from a line
stat_dens1d_labels - Replace labels in data based on 1D density
stat_dens1d_filter - Filter observations by local 1D density
stat_fit_glance - One row summary data frame for a fitted model
quadrant_example.df - Example gene expression data
reverselog_trans - Reverse log transformation
volcano_example.df - Example gene expression data
stat_fmt_tb - Select and slice a tibble nested in data
stat_peaks - Local maxima (peaks) or minima (valleys)
symmetric_limits - Expand a range to make it symmetric
try_data_frame - Convert an R object into a tibble
stat_fit_residuals - Residuals from a model fit
ttheme_gtdefault - Table themes
ttheme_set - Set default table theme
xy_outcomes2factor - Convert two numeric ternary outcomes into a factor
stat_fit_deviations - Residuals from model fit as segments
stat_quadrant_counts - Number of observations in quadrants
stat_fit_augment - Augment data with fitted values and statistics
stat_poly_eq - Equation, p-value, R^2, AIC or BIC of fitted polynomial
Moved - Moved to package 'gginnards'
stat_dens2d_labels - Replace labels in data based on 2D density
scale_x_logFC - Position scales for log fold change data
stat_dens2d_filter - Filter observations by local 2D density
scale_shape_outcome - Shape scale for ternary outcomes