papaja: Prepare APA Journal Articles with R Markdown

papaja is an award-winning R package that facilitates creating computationally reproducible, submission-ready manuscripts which conform to the American Psychological Association (APA) manuscript guidelines (6th Edition). papaja provides

  • an R Markdown template that can be used with (or without) RStudio to create PDF documents (using the apa6 LaTeX class) or Word documents (using a .docx reference file),
  • functions to typeset the results from statistical analyses,
  • functions to create tables, and
  • functions to create figures in accordance with APA guidelines.

For a comprehensive introduction to papaja, see the current draft of the manual. If you have a specific question that is not answered in the manual, feel free to ask a question on Stack Overflow using the papaja tag. If you believe you have found a bug or would like to request a new feature, open an issue on GitHub and provide a minimal, complete, and verifiable example.

Example

Take a look at the source file of the package vignette and the resulting PDF. The vignette also contains some basic instructions.

Installation

To use papaja, you need either a recent version of RStudio or pandoc. To create PDF documents in addition to DOCX documents, you also need a TeX distribution. We recommend TinyTeX, which can be installed from within R:

if(!requireNamespace("tinytex", quietly = TRUE)) install.packages("tinytex")

tinytex::install_tinytex()

You may also consider MiKTeX for Windows, MacTeX for Mac, or TeX Live for Linux. Please refer to the papaja manual for detailed installation instructions.

papaja is available on CRAN but you can also install it from the GitHub repository:

# Install latest CRAN release
install.packages("papaja")

# Install remotes package if necessary
if(!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")

# Install the stable development version from GitHub
remotes::install_github("crsh/papaja")

Usage

Once papaja is installed, you can select the APA template when creating a new R Markdown file through the RStudio menus.

To add citations, specify your bibliography file in the YAML front matter of the document (bibliography: my.bib) and start citing (for details, see the pandoc manual on the citeproc extension). You may also be interested in citr, an RStudio addin to swiftly insert Markdown citations, and in RStudio’s visual editor, which also supports inserting citations.
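As a rough sketch of what this can look like, the following R snippet writes a one-entry bibliography file and a skeleton R Markdown file into a temporary directory. The file names, the citation key R-papaja, and the reduced front matter are assumptions made for illustration; a manuscript created from the papaja template contains further YAML fields, so this skeleton only shows where the bibliography field and the citations go.

```r
# Work in a temporary directory so no existing files are touched
work_dir <- file.path(tempdir(), "papaja-citation-sketch")
dir.create(work_dir, showWarnings = FALSE)

# A one-entry BibTeX file; the key "R-papaja" is chosen here for illustration
bib_lines <- c(
  "@Manual{R-papaja,",
  "  title = {{papaja}: {Prepare} reproducible {APA} journal articles with {R Markdown}},",
  "  author = {Frederik Aust and Marius Barth},",
  "  year = {2023},",
  "  note = {R package version 0.1.2},",
  "  url = {https://github.com/crsh/papaja},",
  "}"
)
writeLines(bib_lines, file.path(work_dir, "my.bib"))

# Skeleton manuscript: the bibliography field points pandoc to the .bib file
rmd_lines <- c(
  "---",
  "title: \"A minimal example\"",
  "bibliography: my.bib",
  "output: papaja::apa6_pdf",
  "---",
  "",
  "This manuscript was written with papaja [@R-papaja].",
  "",
  "@R-papaja describe the package in more detail."
)
writeLines(rmd_lines, file.path(work_dir, "mymanuscript.Rmd"))
```

In pandoc's citation syntax, [@key] produces a parenthetical citation and a bare @key produces an in-text citation.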

Typeset analysis results

The functions apa_print() and apa_table() facilitate reporting the results of your analyses. When you pass an output object of a supported class, such as an htest or lm object, to apa_print(), it returns a list of character strings that you can use to report the results of your analysis.

my_lm <- lm(
  Sepal.Width ~ Sepal.Length + Petal.Width + Petal.Length
  , data = iris
)
apa_lm <- apa_print(my_lm)

apa_lm$full_result$Sepal_Length
## [1] "$b = 0.61$, 95\\% CI $[0.48, 0.73]$, $t(146) = 9.77$, $p < .001$"
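The same pattern should carry over to htest objects. The sketch below runs a t test on the sleep data that ships with R and, only if papaja is installed, passes the result to apa_print(). The element names statistic and full_result are assumed to be available for this kind of test, following the structure shown above for lm objects. In an R Markdown document, such strings are typically inserted into the text with inline code, for example `r apa_t$full_result`.

```r
# Welch two-sample t test on a data set that ships with R
t_out <- t.test(extra ~ group, data = sleep)

# apa_print() is only called if papaja is installed
if (requireNamespace("papaja", quietly = TRUE)) {
  apa_t <- papaja::apa_print(t_out)

  # Test statistic only, e.g. for reporting in parentheses
  print(apa_t$statistic)

  # Estimate, confidence interval, and test statistic combined
  print(apa_t$full_result)
}
```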

papaja currently provides methods for the following object classes:

A-B               D-L      L-S                S-Z
afex_aov          default  lsmobj             summary.aovlist
anova             emmGrid  manova             summary.glht
anova.lme         glht     merMod             summary.glm
Anova.mlm         glm      mixed              summary.lm
aov               htest    papaja_wsci        summary.manova
aovlist           list     summary_emm        summary.ref.grid
BFBayesFactor     lm       summary.Anova.mlm
BFBayesFactorTop  lme      summary.aov

Create tables

apa_table() may be used to produce publication-ready tables in an R Markdown document. For instance, you might want to report some condition means (with standard errors).

npk |>
  # Summarize data
  dplyr::group_by(N, P) |>
  dplyr::summarise(
    mean = mean(yield)
    , se = sd(yield) / sqrt(length(yield))
    , .groups = "drop"
  ) |>
  # Label columns 
  label_variables(
    N = "Nitrogen"
    , P = "Phosphate"
    , mean = "*M*"
    , se = "*SE*"
  ) |>
  # Print table
  apa_table(caption = "Mean pea yield (with standard errors)")

Table 1. Mean pea yield (with standard errors)

Nitrogen  Phosphate  M      SE
0         0          51.72  1.88
0         1          52.42  2.65
1         0          59.22  2.66
1         1          56.15  2.08

This is a fairly simple example, but apa_table() may be used to generate more complex tables.

apa_table(), of course, plays nicely with the output from apa_print(). Thus, it is possible to conveniently report complete regression tables, ANOVA tables, or the output from mixed-effects models.

lm(Sepal.Width ~ Sepal.Length + Petal.Width + Petal.Length, data = iris) |>
  apa_print() |>
  apa_table(caption = "Iris regression table.")

Table 2. Iris regression table.

Predictor     b      95% CI          t      df   p
Intercept     1.04   [0.51, 1.58]    3.85   146  < .001
Sepal Length  0.61   [0.48, 0.73]    9.77   146  < .001
Petal Width   0.56   [0.32, 0.80]    4.55   146  < .001
Petal Length  -0.59  [-0.71, -0.46]  -9.43  146  < .001

Create figures

papaja further provides functions to create publication-ready plots. For example, you can use apa_barplot(), apa_lineplot(), and apa_beeplot() (or the general function apa_factorial_plot()) to visualize the results of factorial study designs:

apa_beeplot(
  data = stroop_data
  , dv = "response_time"
  , id = "id"
  , factors = c("congruency", "load")
  , ylim = c(0, 800)
  , dispersion = wsci # within-subjects confidence intervals
  , conf.level = .99
  , las = 1
)
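The data frame stroop_data is not defined in this section. To try the call above, a simulated stand-in can be built in base R. Everything in the following sketch (sample sizes, effect sizes, noise levels, factor labels) is made up for illustration and does not reproduce the data behind the original figure; only the column names are taken from the plotting code.

```r
set.seed(42)  # make the simulated values reproducible

n_participants <- 20
n_trials <- 10  # trials per participant and design cell

# Fully crossed within-subjects design: every participant sees every cell.
# Character vectors are converted to factors by expand.grid().
stroop_data <- expand.grid(
  trial = seq_len(n_trials)
  , id = seq_len(n_participants)
  , congruency = c("congruent", "incongruent")
  , load = c("low", "high")
)

# Assumed participant-specific offsets (random intercepts)
participant_offset <- rnorm(n_participants, mean = 0, sd = 40)

# Assumed effects: incongruent trials and high load slow responses down
stroop_data$response_time <- 450 +
  60 * (stroop_data$congruency == "incongruent") +
  30 * (stroop_data$load == "high") +
  participant_offset[stroop_data$id] +
  rnorm(nrow(stroop_data), mean = 0, sd = 50)

# Treat the participant identifier as a factor
stroop_data$id <- factor(stroop_data$id)

str(stroop_data)
```

With a data frame of this shape in place, the apa_beeplot() call above should run as written, with the repeated trials averaged per participant and cell before plotting.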

If you prefer ggplot2, try theme_apa().

library("ggplot2")
library("ggforce")

p <- ggplot(
  stroop_data
  , aes(x = congruency, y = response_time, shape = load, fill = load)
) +
  geom_violin(alpha = 0.2, color = grey(0.6)) +
  geom_sina(color = grey(0.6)) +
  stat_summary(position = position_dodge2(0.95), fun.data = mean_cl_normal) +
  lims(y = c(0, max(stroop_data$response_time))) +
  scale_shape_manual(values = c(21, 22)) +
  scale_fill_grey(start = 0.6, end = 1) +
  labs(
    x = "Congruency"
    , y = "Response time"
    , shape = "Cognitive load"
    , fill = "Cognitive load"
  )

p + theme_apa()
## Warning: Computation failed in `stat_summary()`
## Caused by error in `fun.data()`:
## ! The package "Hmisc" is required.
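The warning above indicates that mean_cl_normal needs the Hmisc package, so installing Hmisc is the direct fix. As an alternative, stat_summary() accepts any function that returns a data frame with the columns y, ymin, and ymax. The helper below, mean_ci(), is a hypothetical name and not part of papaja or ggplot2; it is a minimal sketch that computes the mean with a t-based confidence interval in base R.

```r
# Hypothetical helper (not part of papaja or ggplot2): mean with a
# t-based confidence interval, returned in the format stat_summary() expects
mean_ci <- function(x, conf_level = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  m <- mean(x)
  # Half-width of the interval: critical t value times the standard error
  half_width <- qt(1 - (1 - conf_level) / 2, df = n - 1) * sd(x) / sqrt(n)
  data.frame(y = m, ymin = m - half_width, ymax = m + half_width)
}

# Quick check against base R's t.test() on a toy vector
toy <- c(1, 2, 3, 4, 5)
mean_ci(toy)
as.numeric(t.test(toy)$conf.int)
```

In the plot above, the helper would take the place of mean_cl_normal: stat_summary(position = position_dodge2(0.95), fun.data = mean_ci).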

Usage without RStudio

Don’t use RStudio? No problem. Use rmarkdown::draft() to create a new manuscript from the template and rmarkdown::render() to render it:

# Create new R Markdown file
rmarkdown::draft(
  "mymanuscript.Rmd"
  , "apa6"
  , package = "papaja"
  , create_dir = FALSE
  , edit = FALSE
)

# Render manuscript
rmarkdown::render("mymanuscript.Rmd")

Getting help

For a comprehensive introduction to papaja, check out the current draft of the papaja manual. If you have a specific question that is not answered in the manual, feel free to ask a question on Stack Overflow using the papaja tag. If you believe you have found a bug or you want to request a new feature, open an issue on GitHub and provide a minimal, complete, and verifiable example.

Citation

Please cite papaja if you use it.

Aust, F. & Barth, M. (2023). papaja: Prepare reproducible APA journal articles with R Markdown. R package version 0.1.2. Retrieved from https://github.com/crsh/papaja

For convenience, you can use cite_r() or copy the reference information returned by citation('papaja') to your BibTeX file:


@Manual{,
  title = {{papaja}: {Prepare} reproducible {APA} journal articles with {R Markdown}},
  author = {Frederik Aust and Marius Barth},
  year = {2023},
  note = {R package version 0.1.2},
  url = {https://github.com/crsh/papaja},
}
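Copying the entry by hand can also be scripted. The following sketch converts the output of citation() to BibTeX and appends it to a reference file; the file name and location are assumptions, and the code falls back to citing R itself when papaja is not installed so that it runs either way.

```r
# Fall back to R itself when papaja is not installed, so the sketch still runs
pkg <- if (requireNamespace("papaja", quietly = TRUE)) "papaja" else "base"

# citation() returns a bibentry object; toBibtex() converts it to BibTeX lines
bib_entry <- as.character(toBibtex(citation(pkg)))

# Append to a (hypothetical) reference file in the temporary directory
bib_file <- file.path(tempdir(), "r-references.bib")
cat(bib_entry, file = bib_file, sep = "\n", append = TRUE)

# Inspect what was written
writeLines(readLines(bib_file))
```

Note that the entry shown above has an empty citation key (@Manual{,), so a key has to be added by hand before the entry can be cited. The papaja functions r_refs() and cite_r() are meant to automate this for R and the packages used in a manuscript.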

papaja in the wild

If you are interested in seeing how others are using papaja, you can find a collection of papers and the corresponding R Markdown files in the manual.

If you have published a paper that was written with papaja, please add the reference to the public Zotero group yourself or send it to us.

Computational reproducibility

To ensure mid- to long-term computational reproducibility we highly recommend conserving the software environment used to write a manuscript (e.g. R and all R packages) either in a software container or a virtual machine. This way you can be sure that your R code does not break because of updates to R or any R package. For a brief primer on containers and virtual machines see the supplementary material by Klein et al. (2018).

Docker is the most widely used containerization approach. It is open source and free to use but requires some disk space. CodeOcean is a commercial service that builds on Docker, facilitates setting up and sharing containers and lets you run computations in the cloud. See the papaja manual on how to get started using papaja with Docker or CodeOcean and our Docker workflow tailored for easy use with papaja.
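A container or virtual machine preserves the full environment. As a much lighter complement, and not a replacement for either, the R and package versions used for a manuscript can be recorded from within R. The sketch below uses only base R; the output file name is an assumption.

```r
# Record R version, platform, and attached packages alongside the manuscript
session_file <- file.path(tempdir(), "session_info.txt")
writeLines(capture.output(sessionInfo()), session_file)

# Versions of all installed packages, e.g. to rebuild the library later
pkg_versions <- as.data.frame(
  installed.packages()[, c("Package", "Version")]
  , stringsAsFactors = FALSE
)
head(pkg_versions)
```

Such a record documents which versions were used, but it does not by itself guarantee that those versions can be reinstalled later, which is where containers come in.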

Contribute

Like papaja and want to contribute? We highly appreciate any contributions to the R package or its documentation. Take a look at the open issues if you need inspiration. There are many additional analyses that we would like apa_print() to support. Any new S3/S4-methods for this function are always appreciated (e.g., factanal, fa, lavaan). For a primer on adding new apa_print()-methods, see the getting-started-vignette:

vignette("extending_apa_print", package = "papaja")
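For readers unfamiliar with S3, the following toy example illustrates the dispatch mechanism that a new method relies on. It defines a hypothetical generic report() with a method for htest objects; none of these names exist in papaja, and the string formatting is a simplified stand-in, not the package's actual implementation. The vignette remains the reference for how real apa_print() methods are structured.

```r
# Hypothetical generic: dispatches on the class of its first argument
report <- function(x, ...) UseMethod("report")

# Fallback for classes without a dedicated method
report.default <- function(x, ...) {
  stop("No report() method for objects of class ", class(x)[1])
}

# Method for htest objects, as returned by t.test()
report.htest <- function(x, digits = 2, ...) {
  stat <- formatC(unname(x$statistic), format = "f", digits = digits)
  df <- formatC(unname(x$parameter), format = "f", digits = digits)

  # Report very small p values as an inequality and drop the leading zero
  p <- if (x$p.value < .001) {
    "< .001"
  } else {
    paste("=", sub("^0", "", formatC(x$p.value, format = "f", digits = 3)))
  }

  sprintf("$%s(%s) = %s$, $p %s$", names(x$statistic), df, stat, p)
}

# Calling the generic selects report.htest() automatically
report(t.test(extra ~ group, data = sleep))
```

Supporting an additional class then amounts to adding one more function named after the generic and the class, which is the same idea behind contributing an apa_print() method.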

Before working on a contribution, please review our brief contributing guidelines and code of conduct.

Related R packages

By now, there are a couple of R packages that provide convenience functions to facilitate the reporting of statistics in accordance with APA guidelines.

  • apa: Format output of statistical tests in R according to APA guidelines
  • APAstats: R functions for formatting results in APA style and other stuff
  • apaTables: Create American Psychological Association (APA) Style Tables
  • rempsyc: Convenience functions for psychology
  • sigr: Concise formatting of significances in R

If you are looking for other journal article templates, you may be interested in the rticles package.

Package information

  • Version: 0.1.2
  • License: MIT + file LICENSE
  • Maintainer: Frederik Aust
  • Last published: September 29th, 2023
  • Monthly downloads: 2,029
  • Install: install.packages('papaja')

Functions in papaja (0.1.2)

  • apa_factorial_plot_single: Plots for factorial designs that conform to APA guidelines, two-factors internal function
  • apa_print: Typeset Statistical Results
  • apa_print.glht: Typeset Statistical Results from General Linear Hypothesis Tests
  • apa_print.emmGrid: Typeset Statistical Results from Estimated Marginal Means
  • apa_print.aov: Typeset Statistical Results from ANOVA
  • apa_interval: Typeset Interval Estimate
  • apa_num: Typeset Numerical Values for Printing and Reporting
  • add_row_names: Add row names as first column
  • apa6_pdf: APA manuscript (6th edition)
  • apa_p: Prepare Numeric Values for Printing as p value
  • apa_lineplot: Line Plots for Factorial Designs that Conform to APA Guidelines
  • apa_print.list: Typeset Statistical Results from Linear-Model Comparisons
  • apa_print.lme: Typeset Statistical Results from Nonlinear Hierarchical Models
  • apa_barplot: Bar Plots for Factorial Designs that Conform to APA Guidelines
  • apa_table: Prepare Table for Printing and Reporting
  • arrange_anova: Create Variance Table from Various ANOVA objects
  • apa_prepare_doc: Prepare APA document (deprecated)
  • apa_print.merMod: Typeset Statistical Results from Hierarchical GLM
  • canonize: Transform to a Canonical Table
  • cite_r: Cite R and R Packages
  • combine_plotmath: Combine to Expression
  • complete_observations: Remove Incomplete Observations from Data Frame
  • apa_print.BFBayesFactor: Typeset Bayes Factors
  • arrange_regression: Create a Regression Table (defunct)
  • beautify: Beautify a Canonical Table
  • fetch_zotero_refs: Save a collection from a Zotero-Account to a BibTeX-file (defunct)
  • apa_print.papaja_wsci: Typeset Within-Subjects Confidence Intervals
  • conf_int: Between-Subjects Confidence Intervals
  • apa_print.lm: Typeset Statistical Results from GLM
  • corresponding_author_line: Corresponding-Author Line
  • format_cells: Format numeric table cells
  • default_label: Set Default Variable Labels from Column Names
  • apa_print.htest: Typeset Statistical Results from Hypothesis Tests
  • beautify_terms: Prettify Term Names
  • brighten: Brighten up a Color
  • defaults: Set Defaults
  • fast_aggregate: Aggregate data much faster using dplyr
  • hd_int: Highest-Density Intervals
  • $.apa_results_table: Extract Parts of an APA Results Table
  • in_paren: Replace Parentheses with Brackets
  • fetch_web_refs: Fetch a .bib-reference file from the web (defunct)
  • escape_latex: Escape Symbols for LaTeX Output
  • points.matrix: Matrix Method for points()
  • package_available: Package Available
  • indent_stubs: Add stub indentation
  • generate_author_yml: generate_author_yml
  • lines.matrix: Matrix Method for lines()
  • glue_apa_results: Create a New apa_results Object
  • se: Standard Error of the Mean
  • papaja: Prepare APA Journal Articles with R Markdown
  • lookup_names: Lookup Tables for Column Names and Variable Labels
  • init_apa_results: Create Empty Container for Results
  • sel: Select Parameters
  • localize: Lookup Table for Generated Words and Phrases
  • revision_letter_pdf: Revision Letter
  • print_model_comp: Typeset Statistical Results from Model Comparisons
  • merge_tables: Merge tables in list
  • print_scientific: Typeset scientific notation
  • sort_terms: Sort ANOVA or Regression Table by Predictors/Effects
  • print_anova: Format statistics from ANOVA (APA 6th edition)
  • summary.papaja_wsci: Summarize Within-Subjects Confidence Intervals
  • svl: Strip Math Tags from Variable Labels or Strings
  • remove_comments: Remove Comments
  • theme_apa: APA-style ggplot2 Theme
  • render_appendix: Render Appendix (defunct)
  • quote_from_tex: Quote from TeX document
  • r_refs: Create a Reference File for R and R Packages
  • word_title_page: Create title page
  • wsci: Within-Subjects Confidence Intervals
  • sanitize_terms: Sanitize Term Names
  • transmute_df_into_label: Transmute Degrees-of-Freedom Columns into Variable Labels
  • validate: Validate Function Input
  • simple_codebook: Simple Codebook
  • sort_columns: Sort the Columns of an APA Results Table
  • add_col_spanners: Add table headings to group columns
  • add_equals: Add Equals Where Necessary
  • apa_beeplot: Bee-swarm Plots for Factorial Designs that Conform to APA Guidelines
  • apa_df: Typeset Degrees of Freedom
  • apa_factorial_plot: Plots for Factorial Designs that Conform to APA Guidelines
  • add_effect_sizes: Effect Sizes for Analysis of Variance