ggstatsplot: ggplot2 Based Plots with Statistical Details
Raison d’être
“What is to be sought in designs for the display of information is the clear portrayal of complexity. Not the complication of the simple; rather … the revelation of the complex.”
- Edward R. Tufte
ggstatsplot is an extension of the ggplot2 package for creating graphics
with details from statistical tests included in the information-rich
plots themselves. In a typical exploratory data analysis workflow, data
visualization and statistical modeling are two different phases:
visualization informs modeling, and modeling in turn can suggest a
different visualization method, and so on. The central idea of
ggstatsplot is simple: combine these two phases into one in the form of
graphics with statistical details, which makes data exploration simpler
and faster.
Summary of available plots
The package therefore produces a limited set of plot types for the supported analyses:
Function | Plot | Description |
---|---|---|
ggbetweenstats | violin plots | for comparisons between groups/conditions |
ggwithinstats | violin plots | for comparisons within groups/conditions |
gghistostats | histograms | for the distribution of a numeric variable |
ggdotplotstats | dot plots/charts | for the distribution of a labeled numeric variable |
ggscatterstats | scatterplots | for correlation between two variables |
ggcorrmat | correlation matrices | for correlations between multiple variables |
ggpiestats | pie charts | for categorical data |
ggbarstats | bar charts | for categorical data |
ggcoefstats | dot-and-whisker plots | for regression models and meta-analysis |
In addition to these basic plots, ggstatsplot also provides grouped_
versions (see below) that make it easy to repeat the same analysis for
any grouping variable.
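As a minimal sketch of the grouped_ idea, the call below repeats a between-group comparison once per level of a grouping variable. The `mpg` dataset from ggplot2 and the choice of `drv`, `hwy`, and `year` are assumptions made for illustration; they do not come from this README.

```r
# for reproducibility
set.seed(123)
library(ggstatsplot)

# one ggbetweenstats panel per model year (1999 and 2008),
# each comparing highway mileage across drive-train types
p_grouped <- grouped_ggbetweenstats(
  data = ggplot2::mpg,  # illustrative dataset, not from this README
  x = drv,
  y = hwy,
  grouping.var = year
)

p_grouped
```

Each panel carries its own statistical details, so the panels can be read independently.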
Summary of types of statistical analyses
The table below summarizes the types of analyses currently supported in this package:
Functions | Description | Parametric | Non-parametric | Robust | Bayesian |
---|---|---|---|---|---|
ggbetweenstats | Between group/condition comparisons | Yes | Yes | Yes | Yes |
ggwithinstats | Within group/condition comparisons | Yes | Yes | Yes | Yes |
gghistostats, ggdotplotstats | Distribution of a numeric variable | Yes | Yes | Yes | Yes |
ggcorrmat | Correlation matrix | Yes | Yes | Yes | Yes |
ggscatterstats | Correlation between two variables | Yes | Yes | Yes | Yes |
ggpiestats, ggbarstats | Association between categorical variables | Yes | NA | NA | Yes |
ggpiestats, ggbarstats | Equal proportions for categorical variable levels | Yes | NA | NA | Yes |
ggcoefstats | Regression model coefficients | Yes | Yes | Yes | Yes |
ggcoefstats | Random-effects meta-analysis | Yes | NA | Yes | Yes |
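The statistical approach is typically selected with the `type` argument. The sketch below assumes the value `"nonparametric"`, alongside `"parametric"`, `"robust"`, and `"bayes"` for the other columns of the table; check the function documentation for your installed version before relying on these names.

```r
# for reproducibility
set.seed(123)
library(ggstatsplot)

# same comparison as a parametric analysis would make,
# but using a non-parametric test instead
p_nonparametric <- ggbetweenstats(
  data = iris,
  x = Species,
  y = Sepal.Length,
  type = "nonparametric"  # assumed value; see ?ggbetweenstats
)

p_nonparametric
```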
Summary of Bayesian analysis
Analysis | Hypothesis testing | Estimation |
---|---|---|
(one/two-sample) t-test | Yes | Yes |
one-way ANOVA | Yes | Yes |
correlation | Yes | Yes |
(one/two-way) contingency table | Yes | Yes |
random-effects meta-analysis | Yes | Yes |
Statistical reporting
For all statistical tests reported in the plots, the default template abides by the APA gold standard for statistical reporting. For example, here are results from Yuen’s test for trimmed means (robust t-test):
Summary of statistical tests and effect sizes
Here is a summary table of all the statistical tests currently supported across various functions: https://indrajeetpatil.github.io/statsExpressions/articles/stats_details.html
Installation
To get the latest stable CRAN release:
install.packages("ggstatsplot")
Note: Linux users may encounter some installation problems, as several R
packages require external libraries on the system, especially the
PMCMRplus package. The following README file briefly describes the
installation procedure:
https://CRAN.R-project.org/package=PMCMRplus/readme/README.html
You can get the development version of the package from GitHub.
If you are in a hurry and want to reduce installation time, use:
# package needed to install from a GitHub repo
install.packages("remotes")
# install the package from GitHub (needs the `remotes` package to be installed)
remotes::install_github(
  repo = "IndrajeetPatil/ggstatsplot", # package path on GitHub
  dependencies = FALSE, # assumes you have already installed needed packages
  quick = TRUE # skips docs, demos, and vignettes
)
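Because the quick install sets `dependencies = FALSE`, it can help to check beforehand which packages are missing. The helper below is a hypothetical sketch in base R, not part of ggstatsplot, and the package list only names the two packages mentioned in this README rather than the full dependency set.

```r
# Return the subset of `pkgs` that cannot be loaded from the current library.
# `missing_packages` is a hypothetical helper written for this example.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Illustrative list only: packages named in this README,
# not the complete set of ggstatsplot dependencies.
to_check <- c("ggplot2", "PMCMRplus")
not_installed <- missing_packages(to_check)

if (length(not_installed) > 0) {
  message("Install these first: ", paste(not_installed, collapse = ", "))
}
```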
If time is not a constraint:
remotes::install_github(
  repo = "IndrajeetPatil/ggstatsplot", # package path on GitHub
  dependencies = TRUE, # installs packages which ggstatsplot depends on
  upgrade = "always" # updates any out-of-date dependencies
)
To see what changes (and bug fixes) have been made to the package since
the last release on CRAN, you can check the detailed log of changes
here: https://indrajeetpatil.github.io/ggstatsplot/news/index.html
Citation
If you want to cite this package in a scientific journal or in any other
context, run the following code in your R console:
citation("ggstatsplot")
Patil, I. (2018). Visualizations with statistical details: The
'ggstatsplot' approach. PsyArxiv. doi:10.31234/osf.io/p7mku
A BibTeX entry for LaTeX users is
@Article{,
  title = {Visualizations with statistical details: The 'ggstatsplot' approach},
  author = {Indrajeet Patil},
  year = {2021},
  journal = {PsyArxiv},
  url = {https://psyarxiv.com/p7mku/},
  doi = {10.31234/osf.io/p7mku},
}
There is currently a publication in preparation corresponding to this package and the citation will be updated once it’s published.
Documentation and Examples
To see the detailed documentation for each function in the stable CRAN version of the package, see:
Presentation: https://indrajeetpatil.github.io/ggstatsplot_slides/slides/ggstatsplot_presentation.html#1
Vignettes: https://indrajeetpatil.github.io/ggstatsplot/articles/
To see the documentation relevant for the development version of the
package, see the dedicated website for ggstatsplot, which is updated
after every new commit: https://indrajeetpatil.github.io/ggstatsplot/.
Primary functions
Here are examples of the main functions currently supported in
ggstatsplot.
Note: If you are reading this in the GitHub repository, the
documentation below is for the development version of the package, so
you may see some features here that are not yet present in the stable
version of this package on CRAN. For documentation relevant for the CRAN
version, see:
https://CRAN.R-project.org/package=ggstatsplot/readme/README.html
ggbetweenstats
This function creates either a violin plot, a box plot, or a mix of the two for between-group or between-condition comparisons, with results from statistical tests in the subtitle. The simplest function call looks like this:
# for reproducibility
set.seed(123)
library(ggstatsplot)
# plot
ggbetweenstats(
  data = iris,
  x = Species,
  y = Sepal.Length,
  title = "Distribution of sepal length across Iris species"
)
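Since ggstatsplot extends ggplot2, the returned value can be stored and handled like any other ggplot object. The sketch below assumes this and saves the plot with `ggplot2::ggsave()`; the output file name and dimensions are arbitrary choices for illustration.

```r
# for reproducibility
set.seed(123)
library(ggstatsplot)

# store the plot instead of printing it
p <- ggbetweenstats(
  data = iris,
  x = Species,
  y = Sepal.Length,
  title = "Distribution of sepal length across Iris species"
)

# write it to a temporary PNG file (path and size are illustrative)
out_file <- file.path(tempdir(), "ggbetweenstats_iris.png")
ggplot2::ggsave(filename = out_file, plot = p, width = 8, height = 6)
```

Keeping the plot in a variable also makes it possible to add further ggplot2 layers or themes before saving.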