
ggstatsplot: ggplot2 Based Plots with Statistical Details


Raison d’être

“What is to be sought in designs for the display of information is the clear portrayal of complexity. Not the complication of the simple; rather … the revelation of the complex.”

  • Edward R. Tufte

ggstatsplot is an extension of the ggplot2 package for creating graphics with details from statistical tests included in the information-rich plots themselves. In a typical exploratory data analysis workflow, data visualization and statistical modeling are two different phases: visualization informs modeling, and modeling in turn can suggest a different visualization method, and so on. The central idea of ggstatsplot is simple: combine these two phases into one in the form of graphics with statistical details, which makes data exploration simpler and faster.

Summary of available plots

It therefore produces a limited set of plot types for the supported analyses:

| Function | Plot | Description |
|---|---|---|
| ggbetweenstats | violin plots | for comparisons between groups/conditions |
| ggwithinstats | violin plots | for comparisons within groups/conditions |
| gghistostats | histograms | for the distribution of a numeric variable |
| ggdotplotstats | dot plots/charts | for the distribution of a labeled numeric variable |
| ggscatterstats | scatterplots | for correlation between two variables |
| ggcorrmat | correlation matrices | for correlations between multiple variables |
| ggpiestats | pie charts | for categorical data |
| ggbarstats | bar charts | for categorical data |
| ggcoefstats | dot-and-whisker plots | for regression models and meta-analysis |

In addition to these basic plots, ggstatsplot also provides grouped_ versions (see below) that make it easy to repeat the same analysis for any grouping variable.
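As a minimal sketch of what a grouped_ call can look like, the example below repeats a between-group comparison for each level of a grouping variable. The choice of the built-in mtcars dataset and of the variables am, mpg, and vs is an assumption made purely for illustration, not an example from the package documentation.

```r
# for reproducibility
set.seed(123)
library(ggstatsplot)

# compare `mpg` between transmission types (`am`),
# repeating the analysis separately for each level of `vs`
# (dataset and variable choices are illustrative assumptions)
p <- grouped_ggbetweenstats(
  data = mtcars,
  x = am,
  y = mpg,
  grouping.var = vs
)

# display the combined plot
p
```

The result is a single combined graphic with one panel per level of the grouping variable, each with its own statistical details.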

Summary of types of statistical analyses

The table below summarizes all the different types of analyses currently supported in this package:

| Functions | Description | Parametric | Non-parametric | Robust | Bayesian |
|---|---|---|---|---|---|
| ggbetweenstats | Between group/condition comparisons | Yes | Yes | Yes | Yes |
| ggwithinstats | Within group/condition comparisons | Yes | Yes | Yes | Yes |
| gghistostats, ggdotplotstats | Distribution of a numeric variable | Yes | Yes | Yes | Yes |
| ggcorrmat | Correlation matrix | Yes | Yes | Yes | Yes |
| ggscatterstats | Correlation between two variables | Yes | Yes | Yes | Yes |
| ggpiestats, ggbarstats | Association between categorical variables | Yes | NA | NA | Yes |
| ggpiestats, ggbarstats | Equal proportions for categorical variable levels | Yes | NA | NA | Yes |
| ggcoefstats | Regression model coefficients | Yes | Yes | Yes | Yes |
| ggcoefstats | Random-effects meta-analysis | Yes | NA | Yes | Yes |
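The statistical approaches in the table above are typically selected through a single argument. The sketch below assumes that this argument is called type and that it accepts the values "parametric", "nonparametric", "robust", and "bayes"; check the documentation of the installed version to confirm. The iris dataset is used only for illustration.

```r
# for reproducibility
set.seed(123)
library(ggstatsplot)

# statistical approaches to try (assumed values of the `type` argument)
types <- c("parametric", "nonparametric", "robust", "bayes")

# run the same between-group comparison once per approach
plots <- lapply(types, function(tp) {
  ggbetweenstats(
    data = iris,
    x = Species,
    y = Sepal.Length,
    type = tp
  )
})
names(plots) <- types

# inspect one of the plots, e.g. the robust analysis
plots$robust
```

Only the test reported in the subtitle changes between these plots; the underlying data display remains the same.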

Summary of Bayesian analysis

| Analysis | Hypothesis testing | Estimation |
|---|---|---|
| (one/two-sample) t-test | Yes | Yes |
| one-way ANOVA | Yes | Yes |
| correlation | Yes | Yes |
| (one/two-way) contingency table | Yes | Yes |
| random-effects meta-analysis | Yes | Yes |

Statistical reporting

For all statistical tests reported in the plots, the default template abides by the APA gold standard for statistical reporting. For example, here are results from Yuen’s test for trimmed means (robust t-test):

Summary of statistical tests and effect sizes

Here is a summary table of all the statistical tests currently supported across various functions: https://indrajeetpatil.github.io/statsExpressions/articles/stats_details.html

Installation

To get the latest, stable CRAN release:

install.packages("ggstatsplot")

Note:

Linux users may encounter some installation problems, as several R packages require external libraries on the system, especially the PMCMRplus package. The following README file briefly describes the installation procedure: https://CRAN.R-project.org/package=PMCMRplus/readme/README.html

You can get the development version of the package from GitHub.

If you are in a hurry and want to reduce installation time, use:

# needed package to download from GitHub repo
install.packages("remotes")

# downloading the package from GitHub (needs `remotes` package to be installed)
remotes::install_github(
  repo = "IndrajeetPatil/ggstatsplot", # package path on GitHub
  dependencies = FALSE, # assumes you have already installed needed packages
  quick = TRUE # skips docs, demos, and vignettes
)

If time is not a constraint:

remotes::install_github(
  repo = "IndrajeetPatil/ggstatsplot", # package path on GitHub
  dependencies = TRUE, # installs packages which ggstatsplot depends on
  upgrade = "always" # updates any out-of-date dependencies
)

To see what new changes (and bug fixes) have been made to the package since the last release on CRAN, you can check the detailed log of changes here: https://indrajeetpatil.github.io/ggstatsplot/news/index.html

Citation

If you want to cite this package in a scientific journal or in any other context, run the following code in your R console:

citation("ggstatsplot")

  Patil, I. (2018). Visualizations with statistical details: The
  'ggstatsplot' approach. PsyArxiv. doi:10.31234/osf.io/p7mku

A BibTeX entry for LaTeX users is

  @Article{,
    title = {Visualizations with statistical details: The 'ggstatsplot' approach},
    author = {Indrajeet Patil},
    year = {2021},
    journal = {PsyArxiv},
    url = {https://psyarxiv.com/p7mku/},
    doi = {10.31234/osf.io/p7mku},
  }

There is currently a publication in preparation corresponding to this package and the citation will be updated once it’s published.

Documentation and Examples

To see the detailed documentation for each function in the stable CRAN version of the package, see:

To see the documentation relevant for the development version of the package, see the dedicated website for ggstatsplot, which is updated after every new commit: https://indrajeetpatil.github.io/ggstatsplot/.

Primary functions

Here are examples of the main functions currently supported in ggstatsplot.

Note: If you are reading this on the GitHub repository, the documentation below is for the development version of the package, so you may see some features here that are not yet available in the stable version on CRAN. For documentation relevant for the CRAN version, see: https://CRAN.R-project.org/package=ggstatsplot/readme/README.html

ggbetweenstats

This function creates either a violin plot, a box plot, or a mix of the two for between-group or between-condition comparisons, with results from statistical tests in the subtitle. The simplest function call looks like this:

# for reproducibility
set.seed(123)
library(ggstatsplot)

# plot
ggbetweenstats(
  data = iris,
  x = Species,
  y = Sepal.Length,
  title = "Distribution of sepal length across Iris species"
)

Monthly Downloads: 11,396
Version: 0.7.2
License: GPL-3 | file LICENSE
Last Published: April 12th, 2021

Functions in ggstatsplot (0.7.2)