tidystats

Authors: Willem Sleegers, Arnoud Plantinga
License: MIT

tidystats is a package to easily create a text file containing the output of statistical models. The goal of this package is to help researchers accompany their manuscript with an organized data file of statistical results in order to greatly improve the reliability of meta-research and to reduce statistical reporting errors.

To make this possible, tidystats relies on tidy data principles to combine the output of statistical analyses such as t-tests, correlations, ANOVAs, and regression.
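As a rough illustration of what "tidy" means here, the sketch below flattens the output of a base R t-test into one row per statistic. The helper name tidy_htest_sketch and the identifier "sleep_groups" are hypothetical, and the built-in sleep data set stands in for real data. This is not the package's own tidying code, only the general idea.

```r
# Hypothetical helper (not part of tidystats): one row per statistic
tidy_htest_sketch <- function(x, identifier) {
  data.frame(
    identifier = identifier,
    statistic = c(names(x$statistic), names(x$parameter), "p"),
    value = c(unname(x$statistic), unname(x$parameter), x$p.value),
    method = x$method,
    stringsAsFactors = FALSE
  )
}

# Welch two-sample t-test on the built-in sleep data
fit <- t.test(extra ~ group, data = sleep)
tidy_fit <- tidy_htest_sketch(fit, "sleep_groups")
print(tidy_fit)
```

Because every test ends up in the same identifier/statistic/value/method layout, the rows of different tests can be stacked into a single table.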

Besides enabling you to create an organized data file of statistical results, the tidystats package also contains functions to help you report statistics in APA style. Results can be reported using R Markdown or using a new built-in Shiny app. Additionally, development has started on a Google Docs plugin that uses a tidystats data file to report statistics.

Instructions for installing and using the package follow. Note that the package is still in development, so it may contain bugs and is subject to significant changes. If you find a bug or have feedback, please let me know by creating an issue on GitHub (it's really easy to do!).

Installation

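The released version of tidystats can be installed from CRAN. The latest development version can be installed from GitHub using devtools.

# Released version from CRAN
install.packages("tidystats")

# Development version from GitHub
library(devtools)
install_github("willemsleegers/tidystats")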

Setup

Load the package and start by creating an empty list to store the results of statistical models in.

library(tidystats)

results <- list()

Usage

The main function is add_stats(). It has two required arguments:

  • results: The list you want to add the statistical output to.
  • output: The output of a statistical test you want to add to the list (e.g., the output of t.test() or lm()).

Optionally you can also add an identifier, type, whether the analysis was confirmatory or exploratory, and additional notes using the identifier, type, confirmatory, and notes arguments, respectively.

The identifier is used to identify the model (e.g., 'weight_height_correlation'). If you do not provide one, one is automatically created for you.

The type argument is used to indicate whether the statistical test is a hypothesis test, manipulation check, contrast analysis, or other kind of analysis such as descriptives. This can be used to distinguish the vital statistical tests from those less relevant.

The confirmatory argument is used to indicate whether the test was confirmatory or exploratory. It can also be omitted.

The notes argument is used to add any additional information you find useful. Some statistical tests have default notes output (e.g., t-tests), which is overwritten when a notes argument is supplied to the add_stats() function.

Supported statistical functions

Package: stats

  • t.test()
  • cor.test()
  • lm()
  • glm()
  • aov()
  • chisq.test()
  • wilcox.test()
  • fisher.test()

Package: psych

  • alpha()
  • corr.test()
  • ICC()

Package: lme4 and lmerTest

  • lmer()

Example

In the following example we perform several statistical tests on a data set, add the output of these results to a list, and save the results to a file.

The data set is called cox and contains the data of a replication attempt of C.R. Cox, J. Arndt, T. Pyszczynski, J. Greenberg, A. Abdollahi, and S. Solomon (2008, JPSP, 94(4), Exp. 6) by Wissink et al. The replication study was part of the Reproducibility Project (see https://osf.io/ezcuj/). The data set is part of the tidystats package.

# Perform analyses
M1_condition <- t.test(call_parent ~ condition, data = cox, paired = TRUE)
M2_parent_siblings <- cor.test(cox$call_parent, cox$call_siblings, 
  alternative = "greater")
M3_condition_anxiety <- lm(call_parent ~ condition * anxiety , data = cox)
M4_condition_sex <- aov(call_parent ~ condition * sex, data = cox)

# Add results (the %>% pipe is provided by magrittr, which dplyr also attaches)
library(magrittr)
results <- results %>%
  add_stats(M1_condition) %>%
  add_stats(M2_parent_siblings) %>%
  add_stats(M3_condition_anxiety) %>%
  add_stats(M4_condition_sex)

To write the results to a file, use write_stats() with the results list as the first argument.

write_stats(results, "data/results.csv")

To see how the data was actually tidied, you can open the .csv file or you can convert the tidystats results list to a table, as shown below.

library(dplyr)
library(knitr)
options(knitr.kable.NA = '-')

results %>%
  stats_list_to_df() %>%
  select(-notes) %>%
  kable()
| identifier | group | term_nr | term | statistic | value | method |
|---|---|---|---|---|---|---|
| M1_condition | - | - | - | mean of the differences | -2.7700000 | Paired t-test |
| M1_condition | - | - | - | t | -1.2614135 | Paired t-test |
| M1_condition | - | - | - | df | 99.0000000 | Paired t-test |
| M1_condition | - | - | - | p | 0.2101241 | Paired t-test |
| M1_condition | - | - | - | 95% CI lower | -7.1272396 | Paired t-test |
| M1_condition | - | - | - | 95% CI upper | 1.5872396 | Paired t-test |
| M1_condition | - | - | - | null value | 0.0000000 | Paired t-test |
| M2_parent_siblings | - | - | - | r | -0.0268794 | Pearson's product-moment correlation |
| M2_parent_siblings | - | - | - | t | -0.3783637 | Pearson's product-moment correlation |
| M2_parent_siblings | - | - | - | df | 198.0000000 | Pearson's product-moment correlation |
| M2_parent_siblings | - | - | - | p | 0.6472171 | Pearson's product-moment correlation |
| M2_parent_siblings | - | - | - | 95% CI lower | -0.1430882 | Pearson's product-moment correlation |
| M2_parent_siblings | - | - | - | 95% CI upper | 1.0000000 | Pearson's product-moment correlation |
| M2_parent_siblings | - | - | - | null value | 0.0000000 | Pearson's product-moment correlation |
| M3_condition_anxiety | coefficients | 1 | (Intercept) | b | 29.4466534 | Linear model |
| M3_condition_anxiety | coefficients | 1 | (Intercept) | SE | 9.9311192 | Linear model |
| M3_condition_anxiety | coefficients | 1 | (Intercept) | t | 2.9650891 | Linear model |
| M3_condition_anxiety | coefficients | 1 | (Intercept) | p | 0.0034017 | Linear model |
| M3_condition_anxiety | coefficients | 1 | (Intercept) | df | 196.0000000 | Linear model |
| M3_condition_anxiety | coefficients | 2 | conditionmortality salience | b | 20.2945974 | Linear model |
| M3_condition_anxiety | coefficients | 2 | conditionmortality salience | SE | 14.0193962 | Linear model |
| M3_condition_anxiety | coefficients | 2 | conditionmortality salience | t | 1.4476085 | Linear model |
| M3_condition_anxiety | coefficients | 2 | conditionmortality salience | p | 0.1493242 | Linear model |
| M3_condition_anxiety | coefficients | 2 | conditionmortality salience | df | 196.0000000 | Linear model |
| M3_condition_anxiety | coefficients | 3 | anxiety | b | -1.5511207 | Linear model |
| M3_condition_anxiety | coefficients | 3 | anxiety | SE | 3.0119376 | Linear model |
| M3_condition_anxiety | coefficients | 3 | anxiety | t | -0.5149910 | Linear model |
| M3_condition_anxiety | coefficients | 3 | anxiety | p | 0.6071396 | Linear model |
| M3_condition_anxiety | coefficients | 3 | anxiety | df | 196.0000000 | Linear model |
| M3_condition_anxiety | coefficients | 4 | conditionmortality salience:anxiety | b | -5.5666889 | Linear model |
| M3_condition_anxiety | coefficients | 4 | conditionmortality salience:anxiety | SE | 4.3104789 | Linear model |
| M3_condition_anxiety | coefficients | 4 | conditionmortality salience:anxiety | t | -1.2914316 | Linear model |
| M3_condition_anxiety | coefficients | 4 | conditionmortality salience:anxiety | p | 0.1980750 | Linear model |
| M3_condition_anxiety | coefficients | 4 | conditionmortality salience:anxiety | df | 196.0000000 | Linear model |
| M3_condition_anxiety | model | - | - | R squared | 0.0360246 | Linear model |
| M3_condition_anxiety | model | - | - | adjusted R squared | 0.0212698 | Linear model |
| M3_condition_anxiety | model | - | - | F | 2.4415618 | Linear model |
| M3_condition_anxiety | model | - | - | numerator df | 3.0000000 | Linear model |
| M3_condition_anxiety | model | - | - | denominator df | 196.0000000 | Linear model |
| M3_condition_anxiety | model | - | - | p | 0.0655150 | Linear model |
| M4_condition_sex | - | 1 | condition | df | 1.0000000 | Factorial ANOVA |
| M4_condition_sex | - | 1 | condition | SS | 383.6450000 | Factorial ANOVA |
| M4_condition_sex | - | 1 | condition | MS | 383.6450000 | Factorial ANOVA |
| M4_condition_sex | - | 1 | condition | F | 1.7299360 | Factorial ANOVA |
| M4_condition_sex | - | 1 | condition | p | 0.1899557 | Factorial ANOVA |
| M4_condition_sex | - | 2 | sex | df | 1.0000000 | Factorial ANOVA |
| M4_condition_sex | - | 2 | sex | SS | 1140.4861329 | Factorial ANOVA |
| M4_condition_sex | - | 2 | sex | MS | 1140.4861329 | Factorial ANOVA |
| M4_condition_sex | - | 2 | sex | F | 5.1426918 | Factorial ANOVA |
| M4_condition_sex | - | 2 | sex | p | 0.0244352 | Factorial ANOVA |
| M4_condition_sex | - | 3 | condition:sex | df | 1.0000000 | Factorial ANOVA |
| M4_condition_sex | - | 3 | condition:sex | SS | 66.1529617 | Factorial ANOVA |
| M4_condition_sex | - | 3 | condition:sex | MS | 66.1529617 | Factorial ANOVA |
| M4_condition_sex | - | 3 | condition:sex | F | 0.2982976 | Factorial ANOVA |
| M4_condition_sex | - | 3 | condition:sex | p | 0.5855728 | Factorial ANOVA |
| M4_condition_sex | - | 4 | Residuals | df | 196.0000000 | Factorial ANOVA |
| M4_condition_sex | - | 4 | Residuals | SS | 43466.5909054 | Factorial ANOVA |
| M4_condition_sex | - | 4 | Residuals | MS | 221.7683209 | Factorial ANOVA |

Report functions

There are two ways to report your results using tidystats: with R Markdown or with the built-in Shiny app. In both cases, you need the tidystats list that contains the tidied output of your statistical tests.

If you have previously created a tidystats file, you can read in this file to re-create the tidystats list, using the read_stats() function.

results <- read_stats("data/results.csv")

Shiny app

If you do not want to use R Markdown, you can use the built-in Shiny app to interactively produce APA-output and copy it to your manuscript. To start the app, run the inspect() function.

The inspect() function takes the tidystats list as its first argument, optionally followed by one or more identifiers. If no identifiers are provided, all models are displayed. The results of each model are shown in a table, and you can click on a row to produce APA output. This output appears in a text box at the bottom, next to a copy button that copies the results to your clipboard.

R Markdown

You can use the report() function to report your results via R Markdown. This function requires at minimum the tidystats list and an identifier identifying the exact test you want to report. It may also be necessary to provide additional information, such as a term in a regression, for the report() function to figure out what you want to report.

To reduce repetition, you can use options() to set the default tidystats list to use. This way the report() function requires one fewer argument. You set the default tidystats list by running the following code:

options(tidystats_list = results)
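The mechanism behind this is base R's global options. A small sketch with a placeholder list follows. The element content is made up, and how report() reads the option internally is not shown in this document; a getOption() lookup is assumed here.

```r
# Placeholder standing in for a real tidystats list
results <- list(M1_condition = "tidied output would go here")

# Store the list as a session-wide option
options(tidystats_list = results)

# Any function can now retrieve it without it being passed explicitly
default_list <- getOption("tidystats_list")
names(default_list)
```

Note that the option holds a copy of the list as it was when options() was called, so it is worth setting it again after adding more results.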

To figure out how to report the output in APA style, the report() function uses the method information stored in the tidied model. For example, the model with identifier 'M1_condition' is a paired t-test. report() will parse this, see that it is part of the t-test family, and produce results accordingly.
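To make the idea concrete, here is a minimal sketch of method-based formatting, using the M1_condition values from the results table in the Example section. The function format_t_test_sketch is hypothetical and only mimics the style of the output. It is not the package's report() implementation, and it skips cases such as very small p-values.

```r
# Hypothetical formatter, not the tidystats implementation
format_t_test_sketch <- function(t, df, p, ci_lower, ci_upper) {
  # APA style drops the leading zero of p-values
  p_text <- sub("^0", "", sprintf("%.2f", p))
  sprintf("t(%s) = %.2f, p = %s, 95%% CI [%.2f, %.2f]",
          format(df), t, p_text, ci_lower, ci_upper)
}

method <- "Paired t-test"

# Choose the formatter based on the stored method string
if (grepl("t-test", method, fixed = TRUE)) {
  apa <- format_t_test_sketch(t = -1.2614135, df = 99, p = 0.2101241,
                              ci_lower = -7.1272396, ci_upper = 1.5872396)
  print(apa)
}
```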

Below is a list of common report examples:

| code | output |
|---|---|
| report("M1_condition") | t(99) = -1.26, p = .21, 95% CI [-7.13, 1.59] |
| report("M1_condition", statistic = "t") | -1.26 |
| report("M2_parent_siblings") | r(198) = -.027, p = .65 |
| report("M3_condition_anxiety", term = "conditionmortality salience") | b = 20.29, SE = 14.02, t(196) = 1.45, p = .15 |
| report("M3_condition_anxiety", term_nr = 2) | b = 20.29, SE = 14.02, t(196) = 1.45, p = .15 |
| report("M3_condition_anxiety", term = "(Model)") | adjusted R² = .0035, F(1, 198) = 1.70, p = .19 |
| report("M4_condition_sex", term = "condition:sex") | F(1, 196) = 0.30, p = .59 |

As you can see in the examples above, you can use report() to produce a full line of output. You can also retrieve a single statistic by using the statistic argument. Additionally, you can refer to terms using either the term label or the term number (and in some cases, using a group). Although it may be less descriptive to use a term number, it reduces the amount of code clutter in your Markdown document. Our philosophy is, in line with Markdown's general writing philosophy, that the code should not distract from writing. To illustrate, writing part of a results section with tidystats will look like this:

We found no significant difference between the mortality salience condition and the dental pain condition on the number of minutes allocated to calling one's parents, r report("M1_condition").

To execute the code, surround the code segment with backticks (see http://rmarkdown.rstudio.com/lesson-4.html). The sentence then renders as:

We found no significant difference between the mortality salience condition and the dental pain condition on the number of minutes allocated to calling one's parents, t(99) = -1.26, p = .21, 95% CI [-7.13, 1.59].
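Retrieving a single statistic, or referring to a term by label or by number, can be thought of as a filter on the tidied rows. The sketch below uses base R on a hand-copied subset of the results table. The data frame name tidy is made up for illustration, and the package's own lookup logic may differ.

```r
# A few rows copied from the tidied results table
tidy <- data.frame(
  identifier = c("M1_condition", "M1_condition",
                 "M3_condition_anxiety", "M3_condition_anxiety"),
  term_nr = c(NA, NA, 2, 2),
  term = c(NA, NA, "conditionmortality salience",
           "conditionmortality salience"),
  statistic = c("t", "df", "b", "SE"),
  value = c(-1.2614135, 99, 20.2945974, 14.0193962),
  stringsAsFactors = FALSE
)

# Single statistic: filter on identifier and statistic
t_value <- tidy$value[tidy$identifier == "M1_condition" & tidy$statistic == "t"]
sprintf("%.2f", t_value)

# Term label and term number select the same rows
by_label <- subset(tidy, term == "conditionmortality salience")
by_number <- subset(tidy, identifier == "M3_condition_anxiety" & term_nr == 2)
identical(by_label, by_number)
```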

Helper functions

Descriptives

Since it is common to report descriptives alongside statistical results, tidystats includes the describe_data() and count_data() functions, which calculate common descriptive statistics that can be tidied and added to a results list. Several examples follow using the cox data.

# Descriptives of the 'anxiety' variable
describe_data(cox, anxiety)
## # A tibble: 1 x 13
##   var   missing     n     M    SD     SE   min   max range median  mode
##   <chr>   <int> <int> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl>
## 1 anxi…       0   200  3.22 0.492 0.0348  1.38  4.38     3   3.25   3.5
## # ... with 2 more variables: skew <dbl>, kurtosis <dbl>
# By condition
cox %>%
  group_by(condition) %>%
  describe_data(anxiety)
## # A tibble: 2 x 14
## # Groups:   condition [2]
##   var   condition missing     n     M    SD     SE   min   max range median
##   <chr> <chr>       <int> <int> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 anxi… dental p…       0   100  3.26 0.497 0.0497  1.62  4.38  2.75   3.38
## 2 anxi… mortalit…       0   100  3.17 0.485 0.0485  1.38  4.38  3      3.25
## # ... with 3 more variables: mode <dbl>, skew <dbl>, kurtosis <dbl>
# Descriptives of a non-numeric variable
count_data(cox, condition)
## # A tibble: 2 x 4
##   var       group                  n   pct
##   <chr>     <chr>              <int> <dbl>
## 1 condition dental pain          100    50
## 2 condition mortality salience   100    50
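For readers who want to check how these columns relate to each other, the base R sketch below computes a subset of them for the built-in sleep data. The helper describe_sketch is hypothetical and is not describe_data() itself. It assumes the usual definitions, SE = SD / sqrt(n) and range = max - min, which agree with the anxiety output above (0.492 / sqrt(200) ≈ 0.0348 and 4.38 - 1.38 = 3).

```r
# Hypothetical helper (not describe_data() itself)
describe_sketch <- function(x) {
  n <- sum(!is.na(x))
  data.frame(
    missing = sum(is.na(x)),
    n = n,
    M = mean(x, na.rm = TRUE),
    SD = sd(x, na.rm = TRUE),
    SE = sd(x, na.rm = TRUE) / sqrt(n),  # standard error of the mean
    min = min(x, na.rm = TRUE),
    max = max(x, na.rm = TRUE),
    range = diff(range(x, na.rm = TRUE)),
    median = median(x, na.rm = TRUE)
  )
}

desc <- describe_sketch(sleep$extra)
print(desc)
```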

If you use the describe_data() and count_data() functions from the tidystats package to get the descriptives, you can use the tidy_describe_data() and tidy_count_data() functions to tidy the output and subsequently add it to a results list.

(Note: This will soon be improved)

anxiety_tidy <- cox %>%
  describe_data(anxiety) %>%
  tidy_describe_data()

results <- results %>%
  add_stats(anxiety_tidy, type = "d", notes = "Anxious attachment style")
## Warning in add_stats.data.frame(., anxiety_tidy, type = "d", notes =
## "Anxious attachment style"): You added a data.frame to your results list.
## Please make sure it is properly tidied.
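Tidying descriptives essentially means going from one wide row to one row per statistic. The base R sketch below shows that reshaping for three of the anxiety values printed earlier. It is an assumption about the general shape of the result, not the actual output format of tidy_describe_data().

```r
# One row of descriptives in wide form (values from the anxiety example)
wide <- data.frame(var = "anxiety", n = 200, M = 3.22, SD = 0.492,
                   stringsAsFactors = FALSE)

stat_cols <- c("n", "M", "SD")

# Long form: one row per statistic
long <- data.frame(
  var = wide$var,
  statistic = stat_cols,
  value = unlist(wide[1, stat_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)
print(long)
```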

Version: 0.3
License: MIT + file LICENSE
Maintainer: Willem Sleegers
Last Published: January 3rd, 2019
Install: install.packages('tidystats')
Functions in tidystats (0.3)

  • correlation_table: Create a correlation table
  • report_anova: Report method for ANOVA models
  • count_data: Count the total of observations
  • inspect.list: Inspect (a) statistical model(s) added to your tidystats list
  • report_chi_squared: Report function for a chi-squared test
  • report_fisher: Report function for Fisher's Exact Tests for Count Data
  • report_p_value: Report p-value function
  • add_stats.data.frame: add_stats data frame function
  • add_stats: Add statistical output to a tidy stats list
  • inspect_click_script: Load in Javascript code to figure out clicks in the inspect() function
  • report_glm: Report method for generalized linear models
  • add_stats_to_model: Add statistical output to a model in a tidy stats list
  • cox: Data of a replication study of C.R. Cox, J. Arndt, T. Pyszczynski, J. Greenberg, A. Abdollahi, S. Solomon (2008, JPSP, 94(4), Exp. 6)
  • css_style: Load in CSS code to style HTML content
  • copy_to_clipboard_script: Load in Javascript code to copy content to the clipboard
  • rename_columns: Rename statistics columns
  • report_rma: Report method for metafor's rma models
  • tidy_stats.aovlist: Create a tidy stats data frame from an aovlist object
  • report: Report function
  • inspect: Inspect the output of (a) statistical model(s) via an interactive Shiny app
  • describe_data: Calculate common descriptive statistics
  • report_table_lm: Report table method for linear regression models
  • tidy_stats.confint: tidy_stats method for confint output
  • read_stats: Read a .csv file that was produced with write_stats
  • tidy_stats.rma: Create a tidy stats data frame from an rma object from the metafor package
  • report_statistic: Report a single statistic
  • write_stats: Save the results in a tidy stats list to a .csv file
  • tidy_stats.anova: Create a tidy stats data frame from an anova object
  • add_stats.default: add_stats default function
  • add_stats.matrix: add_stats matrix function
  • tidy_stats.aov: Create a tidy stats data frame from an aov object
  • report_correlation: Report function for correlations
  • report_descriptives: Report descriptives helper functions
  • report_wilcoxon: Report function for Wilcoxon Rank Sum and Signed Rank Tests
  • report_lm: Report method for linear regression models
  • tidy_stats.glm: Create a tidy stats data frame from a glm object
  • tidy_stats.htest: Create a tidy stats data frame from an htest object
  • report_lmm: Report method for linear mixed models
  • stats_list_to_df: Convert a tidy stats list to a data frame
  • tidy_count_data: Convert count data to a tidy data frame
  • report_t_test: Report function for t-tests
  • tidy_describe_data: Convert descriptives to a tidy data frame
  • tidy_stats: Create a tidy stats data frame from a statistical output object
  • tidy_stats.lm: Create a tidy stats data frame from an lm object
  • tidy_stats.lmerModLmerTest: Create a tidy stats data frame from an lmerModLmerTest object
  • tidy_stats.psych: tidy_stats method for psych's alpha objects
  • tidy_stats.lmerMod: Create a tidy stats data frame from an lmerMod object
  • inspect.default: Inspect (a) statistical model(s) output