simpsons_paradox

Two datasets demonstrating Simpson's Paradox: a strongly positively correlated dataset (<code>simpson_1</code>),
and a dataset with the same overall positive correlation as <code>simpson_1</code> in which the individual groups
each have a strong negative correlation (<code>simpson_2</code>).
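
The effect described above can be reproduced with a small synthetic sketch in base R. The numbers below are made up for illustration and are not the values shipped in <code>simpsons_paradox</code>; the group count, group size, and spacing are all hypothetical choices.

```r
# Synthetic illustration only; these are NOT the values in simpsons_paradox.
set.seed(42)

n_groups  <- 5    # hypothetical number of groups
n_per_grp <- 40   # hypothetical number of points per group

group <- rep(seq_len(n_groups), each = n_per_grp)

# Group centres climb along a positive diagonal ...
centre_x <- 20 * group
centre_y <- 20 * group

# ... while points inside each group fall along a negative slope.
offset <- runif(n_groups * n_per_grp, min = -8, max = 8)
x <- centre_x + offset
y <- centre_y - offset + rnorm(n_groups * n_per_grp, sd = 1)

# Pooled correlation ignores the grouping.
pooled_cor <- cor(x, y)

# Correlation computed separately inside each group.
within_cor <- sapply(split(data.frame(x, y), group),
                     function(g) cor(g$x, g$y))

print(pooled_cor)   # positive
print(within_cor)   # negative for every group
```

Because the group centres rise together while the points inside each group fall, the pooled correlation is positive even though every within-group correlation is negative.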

datasets

The Datasaurus Dozen is a set of datasets that share the same summary statistics
despite having radically different distributions. The datasets offer a larger and
quirkier version of the object lesson typically taught via Anscombe's Quartet
(available in the 'datasets' package). Anscombe's Quartet contains four very
different distributions with the same summary statistics, and so highlights the
value of visualisation, over and above summary statistics, in understanding data.
As well as being an engaging variant on the Quartet, the data is generated in a
novel way. The simulated annealing process used to derive the datasets from the
original Datasaurus is detailed in "Same Stats, Different Graphs: Generating
Datasets with Varied Appearance and Identical Statistics through Simulated Annealing"
<doi:10.1145/3025453.3025912>.

Stephanie Locke

datasauRus

Datasets from the Datasaurus Dozen

Steph Locke

Alberto Cairo

Justin Matejka

George Fitzmaurice

Lucy D'Agostino McGowan

Richard Cotton

Locke Data 

simpsons_paradox function

A data frame with 444 rows and 3 variables:<ul>
<li>dataset: indicates which of the two datasets the data are from, <code>simpson_1</code> or <code>simpson_2</code></li>
<li>x: x-values</li>
<li>y: y-values</li>
</ul>
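
A minimal sketch for checking this format, assuming the <code>datasauRus</code> package is installed (for example via <code>install.packages("datasauRus")</code>). The data frame documented above has no group column, so this only compares the overall correlation of the two datasets.

```r
# Assumes the datasauRus package is installed.
library(datasauRus)

# Inspect the documented shape: 444 rows, columns dataset, x and y.
str(simpsons_paradox)

# Count the rows belonging to each dataset.
table(simpsons_paradox$dataset)

# Overall correlation between x and y, computed separately per dataset.
overall_cor <- sapply(split(simpsons_paradox, simpsons_paradox$dataset),
                      function(d) cor(d$x, d$y))
print(overall_cor)
```

Plotting <code>x</code> against <code>y</code> for each dataset is the natural next step, since the two datasets are meant to differ visually rather than in their overall correlation.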

Format

Simpson's Paradox data: simpsons_paradox

