datasauRus

This package wraps the awesome Datasaurus Dozen datasets. The Datasaurus Dozen show us why visualisation is important – summary statistics can be the same but distributions can be very different. In short, this package gives a fun alternative to Anscombe’s Quartet, available in R as anscombe.
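For comparison, the same effect can be seen in the built-in anscombe data. The following is a minimal sketch using only base R; the name `quartet_stats` is illustrative and not part of any package.

```r
# anscombe ships with base R (package 'datasets'): columns x1..x4 and y1..y4.
quartet_stats <- t(sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  # Same summaries for each of the four x/y pairs.
  c(mean_x = mean(x), mean_y = mean(y),
    sd_x = sd(x), sd_y = sd(y),
    cor_xy = cor(x, y))
}))
rownames(quartet_stats) <- paste0("set", 1:4)

# Rounded to two decimals, the four rows are nearly indistinguishable.
print(round(quartet_stats, 2))
```

Plotting each pair (for example `plot(anscombe$x2, anscombe$y2)`) shows how different the four sets look despite the matching summaries.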

The original Datasaurus was created by Alberto Cairo in this great blog post.

The other Dozen were generated using simulated annealing and the process is described in the paper “Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing” by Justin Matejka and George Fitzmaurice (open access materials including manuscript and code, official paper).

In the paper, Justin and George simulate a variety of datasets that have the same summary statistics as the Datasaurus but very different distributions.
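One way to check this claim yourself is to compute the summaries per dataset. The sketch below assumes the package is installed and uses only base R on the long-format `datasaurus_dozen` data (columns `dataset`, `x` and `y`); the names `per_dataset` and `dozen_stats` are illustrative.

```r
library(datasauRus)

# Split the long-format data into one data frame per dataset.
per_dataset <- split(datasaurus_dozen, datasaurus_dozen$dataset)

# Compute the same summary statistics for every dataset.
dozen_stats <- do.call(rbind, lapply(per_dataset, function(d) {
  data.frame(
    mean_x = mean(d$x), mean_y = mean(d$y),
    sd_x   = sd(d$x),   sd_y   = sd(d$y),
    cor_xy = cor(d$x, d$y)
  )
}))

# One row per dataset; the row names are the dataset names.
print(round(dozen_stats, 2))
```

The rows should agree closely with one another, even though the scatter plots of the individual datasets look nothing alike.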

Install

The latest stable version (0.1.2) is available on CRAN

install.packages("datasauRus")

You can also get the latest development version from GitHub, using devtools to install the package

devtools::install_github("lockedata/datasauRus")

Usage

You can use the package to produce Anscombe plots and more.

library(ggplot2)
library(datasauRus)
ggplot(datasaurus_dozen, aes(x=x, y=y, colour=dataset))+
  geom_point()+
  theme_void()+
  theme(legend.position = "none")+
  facet_wrap(~dataset, ncol=3)
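To look at a single dataset rather than the full grid, you can filter on the `dataset` column first. This is a small sketch, assuming the original dinosaur is stored under the name "dino"; `dino` and `dino_plot` are illustrative variable names.

```r
library(ggplot2)
library(datasauRus)

# Keep only the rows of the dinosaur-shaped dataset
# (assumes it is labelled "dino" in the dataset column).
dino <- subset(datasaurus_dozen, dataset == "dino")

# Fixed aspect ratio so the shape is not stretched.
dino_plot <- ggplot(dino, aes(x = x, y = y)) +
  geom_point() +
  coord_fixed() +
  theme_minimal()

print(dino_plot)
```

Swapping "dino" for another value of `unique(datasaurus_dozen$dataset)` plots that dataset instead.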

Contributing to the package

Wanna report a bug or suggest a feature? Great stuff! For more information on how to contribute check out our contributing guide.

Please note that this R package is released with a Contributor Code of Conduct. By participating in this package project you agree to abide by its terms.

Monthly Downloads: 1,691

Version: 0.1.4

License: MIT + file LICENSE

Maintainer: Stephanie Locke

Last Published: September 20th, 2018

Functions in datasauRus (0.1.4)

datasauRus: datasauRus

twelve_from_slant_long: Twelve From Slant (long) data

datasaurus_dozen: Datasaurus Dozen data

simpsons_paradox_wide: Simpsons Paradox (wide) data

box_plots: Box plot data

twelve_from_slant_alternate_long: Twelve From Slant Alternate (long) data

twelve_from_slant_alternate_wide: Twelve From Slant Alternate (wide) data

twelve_from_slant_wide: Twelve From Slant (wide) data

simpsons_paradox: Simpsons Paradox data

datasaurus_dozen_wide: Datasaurus Dozen (wide) data
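The set of datasets can differ between package versions, so it may be worth checking what your installed copy provides. The sketch below uses the base R `data()` function; the variable name `available` is illustrative, and the wide dataset is inspected without assuming its column names.

```r
# List the datasets shipped with the installed version of the package.
available <- data(package = "datasauRus")$results[, c("Item", "Title")]
print(available)

library(datasauRus)

# Inspect the shape and first few column names of a wide-format dataset.
print(dim(datasaurus_dozen_wide))
print(head(names(datasaurus_dozen_wide)))
```

The long-format datasets are usually the more convenient ones for ggplot2, since the `dataset` column can be mapped directly to facets or colour.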