simpsons_paradox

Simpson's Paradox data

A dataset demonstrating Simpson's Paradox. It contains two datasets: simpson_1, in which x and y are strongly positively correlated, and simpson_2, which has the same overall positive correlation as simpson_1 but in which the individual groups each show a strong negative correlation.
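The claim that both datasets share the same overall correlation can be checked directly. The following is a minimal sketch, assuming the datasauRus package is installed; the variable names by_dataset and overall_cor are illustrative and not part of the package.

```r
# Assumes the datasauRus package is installed; it provides simpsons_paradox.
library(datasauRus)

# Split the rows by the dataset label (simpson_1 / simpson_2)
by_dataset <- split(simpsons_paradox, simpsons_paradox$dataset)

# Overall Pearson correlation between x and y within each dataset
overall_cor <- sapply(by_dataset, function(d) cor(d$x, d$y))
print(overall_cor)
```

The data frame has no column identifying the groups inside simpson_2, so the negative within-group trend is easiest to see in a scatter plot rather than in a grouped summary.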

Keywords

datasets

Usage

simpsons_paradox

Format

A data frame with 444 rows and 3 variables:

  • dataset: indicates which of the two datasets the data are from, simpson_1 or simpson_2

  • x: x-values

  • y: y-values
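The structure described above can be inspected after loading the package. This is a short sketch, assuming datasauRus is installed; it uses only base R functions and the name rows_per_dataset is illustrative.

```r
# Assumes the datasauRus package is installed; it provides simpsons_paradox.
library(datasauRus)

# Column names and types of the data frame
str(simpsons_paradox)

# Number of rows belonging to each of the two datasets
rows_per_dataset <- table(simpsons_paradox$dataset)
print(rows_per_dataset)
```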

References

Matejka, J., & Fitzmaurice, G. (2017). Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing. CHI 2017 Conference proceedings: ACM SIGCHI Conference on Human Factors in Computing Systems. Retrieved from https://www.autodeskresearch.com/publications/samestats.

Aliases

  • simpsons_paradox

Examples
# NOT RUN {
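# Plot each dataset in its own panel (only runs if ggplot2 is available)
if (require(ggplot2)) {
  ggplot(simpsons_paradox, aes(x = x, y = y, colour = dataset)) +
    geom_point() +
    theme(legend.position = "none") +  # facet strips already label the panels
    facet_wrap(~dataset, ncol = 2)     # two datasets, so two columns
}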
# }
Documentation reproduced from package datasauRus, version 0.1.4, License: MIT + file LICENSE
