equatiomatic

The goal of equatiomatic is to reduce the pain associated with writing LaTeX code from a fitted model. In the future, the package aims to support any model supported by broom. See the introduction to equatiomatic for currently supported models.

Installation

equatiomatic is not yet on CRAN. Install the development version from GitHub with

remotes::install_github("datalorax/equatiomatic")

Basic usage

The gif above shows the basic functionality.

To convert a model to LaTeX, feed a model object to extract_eq():

library(equatiomatic)

# Fit a simple model
mod1 <- lm(mpg ~ cyl + disp, mtcars)

# Give the results to extract_eq
extract_eq(mod1)
#> $$
#> \operatorname{mpg} = \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{disp}) + \epsilon
#> $$

The model can be built in any standard way, including with shortcut syntax:

mod2 <- lm(mpg ~ ., mtcars)
extract_eq(mod2)
#> $$
#> \operatorname{mpg} = \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{disp}) + \beta_{3}(\operatorname{hp}) + \beta_{4}(\operatorname{drat}) + \beta_{5}(\operatorname{wt}) + \beta_{6}(\operatorname{qsec}) + \beta_{7}(\operatorname{vs}) + \beta_{8}(\operatorname{am}) + \beta_{9}(\operatorname{gear}) + \beta_{10}(\operatorname{carb}) + \epsilon
#> $$
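As a quick check of what the `.` shortcut expands to, the sketch below uses only base R (it does not call equatiomatic) and lists the predictors that end up in the fitted model. The variable name `predictors` is introduced here for illustration only.

```r
# mtcars ships with base R, so this runs without extra packages
mod2 <- lm(mpg ~ ., mtcars)

# Drop the intercept; what remains are the predictors that "." expanded to
predictors <- names(coef(mod2))[-1]
print(predictors)

# The shortcut stands for every column of the data except the outcome
identical(predictors, setdiff(names(mtcars), "mpg"))
```

These are the same ten variables, in the same order, that appear as the terms in the equation above.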

When using categorical variables, it will include the levels of the variables as subscripts. Here, we use the penguins data from the {palmerpenguins} package.

mod3 <- lm(body_mass_g ~ bill_length_mm + species, penguins)
extract_eq(mod3)
#> $$
#> \operatorname{body\_mass\_g} = \alpha + \beta_{1}(\operatorname{bill\_length\_mm}) + \beta_{2}(\operatorname{species}_{\operatorname{Chinstrap}}) + \beta_{3}(\operatorname{species}_{\operatorname{Gentoo}}) + \epsilon
#> $$

It helpfully preserves the order in which the variables are supplied in the formula:

set.seed(8675309)
d <- data.frame(cat1 = rep(letters[1:3], 100),
                cat2 = rep(LETTERS[1:3], each = 100),
                cont1 = rnorm(300, 100, 1),
                cont2 = rnorm(300, 50, 5),
                out   = rnorm(300, 10, 0.5))
mod4 <- lm(out ~ cont1 + cat2 + cont2 + cat1, d)
extract_eq(mod4)
#> $$
#> \operatorname{out} = \alpha + \beta_{1}(\operatorname{cont1}) + \beta_{2}(\operatorname{cat2}_{\operatorname{B}}) + \beta_{3}(\operatorname{cat2}_{\operatorname{C}}) + \beta_{4}(\operatorname{cont2}) + \beta_{5}(\operatorname{cat1}_{\operatorname{b}}) + \beta_{6}(\operatorname{cat1}_{\operatorname{c}}) + \epsilon
#> $$
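The subscripts and the term order can be compared against what R itself reports for the fitted model. The following is a minimal base R sketch (it rebuilds the same data and model, and does not call equatiomatic); `coef_names` is a name introduced here for illustration.

```r
set.seed(8675309)
d <- data.frame(cat1 = rep(letters[1:3], 100),
                cat2 = rep(LETTERS[1:3], each = 100),
                cont1 = rnorm(300, 100, 1),
                cont2 = rnorm(300, 50, 5),
                out   = rnorm(300, 10, 0.5))
mod4 <- lm(out ~ cont1 + cat2 + cont2 + cat1, d)

# Coefficient names are the variable name followed by the factor level,
# listed in formula order; the first level of each factor is the reference
# category and therefore gets no coefficient of its own
coef_names <- names(coef(mod4))
print(coef_names)
```

Each non-reference level (B and C for cat2, b and c for cat1) has its own coefficient, which lines up with the subscripted terms in the equation above.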

Appearance

You can wrap the equation with wrap = TRUE so that a specified number of terms appears on each line of the right-hand side. Set that number with terms_per_line (defaults to 4):

extract_eq(mod2, wrap = TRUE)
#> $$
#> \begin{aligned}
#> \operatorname{mpg} &= \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{disp}) + \beta_{3}(\operatorname{hp})\ + \\
#> &\quad \beta_{4}(\operatorname{drat}) + \beta_{5}(\operatorname{wt}) + \beta_{6}(\operatorname{qsec}) + \beta_{7}(\operatorname{vs})\ + \\
#> &\quad \beta_{8}(\operatorname{am}) + \beta_{9}(\operatorname{gear}) + \beta_{10}(\operatorname{carb}) + \epsilon
#> \end{aligned}
#> $$
extract_eq(mod2, wrap = TRUE, terms_per_line = 6)
#> $$
#> \begin{aligned}
#> \operatorname{mpg} &= \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{disp}) + \beta_{3}(\operatorname{hp}) + \beta_{4}(\operatorname{drat}) + \beta_{5}(\operatorname{wt})\ + \\
#> &\quad \beta_{6}(\operatorname{qsec}) + \beta_{7}(\operatorname{vs}) + \beta_{8}(\operatorname{am}) + \beta_{9}(\operatorname{gear}) + \beta_{10}(\operatorname{carb}) + \epsilon
#> \end{aligned}
#> $$
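To make the counting rule concrete, here is a minimal base R sketch of the grouping idea. It is an illustration only, not the package's implementation: `wrap_terms()` is a hypothetical helper, and the term strings are simplified placeholders. Note that the intercept and the error term each count as one term, which matches the outputs above.

```r
# Hypothetical helper (not part of equatiomatic): group right-hand-side terms
# into lines of `terms_per_line` and join them with trailing "+" operators.
wrap_terms <- function(terms, terms_per_line = 4) {
  # Assign each term a line number: 1, 1, 1, 1, 2, 2, 2, 2, ...
  line_id <- ceiling(seq_along(terms) / terms_per_line)
  lines <- vapply(split(terms, line_id), paste, character(1), collapse = " + ")
  lines <- unname(lines)
  # Every line except the last ends with a trailing operator and a line break
  n <- length(lines)
  if (n > 1) {
    lines[-n] <- paste0(lines[-n], " + \\\\")
  }
  lines
}

# Twelve placeholder terms: intercept, ten slopes, and the error term
rhs <- c("\\alpha", paste0("\\beta_{", 1:10, "}(x_{", 1:10, "})"), "\\epsilon")

wrapped <- wrap_terms(rhs, terms_per_line = 4)
cat(wrapped, sep = "\n")
```

With twelve terms, four per line gives three lines and six per line gives two, the same line counts as the two wrapped equations above.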

When wrapping, you can choose whether lines end with trailing math operators like + (operator_location = "end", the default) or begin with them (operator_location = "start"):

extract_eq(mod2, wrap = TRUE, terms_per_line = 4, operator_location = "start")
#> $$
#> \begin{aligned}
#> \operatorname{mpg} &= \alpha + \beta_{1}(\operatorname{cyl}) + \beta_{2}(\operatorname{disp}) + \beta_{3}(\operatorname{hp})\\
#> &\quad + \beta_{4}(\operatorname{drat}) + \beta_{5}(\operatorname{wt}) + \beta_{6}(\operatorname{qsec}) + \beta_{7}(\operatorname{vs})\\
#> &\quad + \beta_{8}(\operatorname{am}) + \beta_{9}(\operatorname{gear}) + \beta_{10}(\operatorname{carb}) + \epsilon
#> \end{aligned}
#> $$

By default, all text in the equation is wrapped in \operatorname{}. You can optionally have the variables themselves be italicized (i.e. not be wrapped in \operatorname{}) with ital_vars = TRUE:

extract_eq(mod2, wrap = TRUE, ital_vars = TRUE)
#> $$
#> \begin{aligned}
#> mpg &= \alpha + \beta_{1}(cyl) + \beta_{2}(disp) + \beta_{3}(hp)\ + \\
#> &\quad \beta_{4}(drat) + \beta_{5}(wt) + \beta_{6}(qsec) + \beta_{7}(vs)\ + \\
#> &\quad \beta_{8}(am) + \beta_{9}(gear) + \beta_{10}(carb) + \epsilon
#> \end{aligned}
#> $$

R Markdown and previewing

If you include extract_eq() in an R Markdown chunk with results="asis", knitr will render the equation.

Alternatively, you can run the code interactively, copy/paste the equation to where you want it in your document, and make any edits you’d like.

You can use the tex_preview() function from the texPreview package to preview the equation in RStudio:

library(texPreview)
tex_preview(extract_eq(mod1))

Both extract_eq() and tex_preview() work with magrittr pipes, so you can do something like this:

library(magrittr)  # or library(tidyverse) or any other package that exports %>%

extract_eq(mod1) %>% 
  tex_preview()

Extra options

There are several extra options you can enable with additional arguments to extract_eq().

Actual coefficients

You can return actual numeric coefficients instead of Greek letters with use_coefs = TRUE:

extract_eq(mod1, use_coefs = TRUE)
#> $$
#> \operatorname{mpg} = 34.66 - 1.59(\operatorname{cyl}) - 0.02(\operatorname{disp}) + \epsilon
#> $$
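The numbers in the equation can be cross-checked against the fitted model directly. The sketch below uses base R only and assumes, based on the output shown above, that the displayed values are the model coefficients rounded to two decimal places; it makes no claim about how the package computes them internally.

```r
# Fit the same model as mod1 in the text (mtcars ships with base R)
mod1 <- lm(mpg ~ cyl + disp, mtcars)

# Estimated coefficients, rounded to two decimal places
coefs <- round(coef(mod1), 2)
print(coefs)
```

The intercept and the two slopes printed here should agree with the values shown in the equation.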

By default, it will remove doubled operators like “+ -”, but you can keep those in (which is often useful for teaching) with fix_signs = FALSE:

extract_eq(mod1, use_coefs = TRUE, fix_signs = FALSE)
#> $$
#> \operatorname{mpg} = 34.66 + -1.59(\operatorname{cyl}) + -0.02(\operatorname{disp}) + \epsilon
#> $$
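The effect of sign fixing can be imitated with a plain string replacement. This is only a sketch of the idea in base R, using a simplified equation string; it is not the package's own implementation, and the names `eq_raw` and `eq_fixed` are introduced here for illustration.

```r
# Equation text with doubled operators, simplified from the output above
eq_raw <- "34.66 + -1.59(cyl) + -0.02(disp) + \\epsilon"

# Replace every "+ -" with a single "- " (fixed = TRUE avoids regex escaping)
eq_fixed <- gsub("+ -", "- ", eq_raw, fixed = TRUE)
cat(eq_fixed, "\n")
```

Leaving the doubled operators in place keeps the "coefficient times variable, summed" structure visible, which is why the unfixed form can be handy for teaching.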

This works in longer wrapped equations:

extract_eq(mod2, wrap = TRUE, terms_per_line = 3,
           use_coefs = TRUE, fix_signs = FALSE)
#> $$
#> \begin{aligned}
#> \operatorname{mpg} &= 12.3 + -0.11(\operatorname{cyl}) + 0.01(\operatorname{disp})\ + \\
#> &\quad -0.02(\operatorname{hp}) + 0.79(\operatorname{drat}) + -3.72(\operatorname{wt})\ + \\
#> &\quad 0.82(\operatorname{qsec}) + 0.32(\operatorname{vs}) + 2.52(\operatorname{am})\ + \\
#> &\quad 0.66(\operatorname{gear}) + -0.2(\operatorname{carb}) + \epsilon
#> \end{aligned}
#> $$

Beyond lm()

You’re not limited to just lm models! equatiomatic supports many other models, including logistic regression, probit regression, and ordered logistic regression (with MASS::polr()).

Logistic regression with glm()

library(palmerpenguins)

model_logit <- glm(sex ~ bill_length_mm + species, 
                   data = penguins, family = binomial(link = "logit"))
extract_eq(model_logit, wrap = TRUE, terms_per_line = 3)
#> $$
#> \begin{aligned}
#> \log\left[ \frac { P( \operatorname{sex} = \operatorname{male} ) }{ 1 - P( \operatorname{sex} = \operatorname{male} ) } \right] &= \alpha + \beta_{1}(\operatorname{bill\_length\_mm}) + \beta_{2}(\operatorname{species}_{\operatorname{Chinstrap}})\ + \\
#> &\quad \beta_{3}(\operatorname{species}_{\operatorname{Gentoo}}) + \epsilon
#> \end{aligned}
#> $$

Probit regression with glm()

model_probit <- glm(sex ~ bill_length_mm + species, 
                    data = penguins, family = binomial(link = "probit"))
extract_eq(model_probit, wrap = TRUE, terms_per_line = 3)
#> $$
#> \begin{aligned}
#> P(\operatorname{sex} = \operatorname{male}) &= \Phi[\alpha + \beta_{1}(\operatorname{bill\_length\_mm}) + \beta_{2}(\operatorname{species}_{\operatorname{Chinstrap}})\ + \\
#> &\qquad\ \beta_{3}(\operatorname{species}_{\operatorname{Gentoo}}) + \epsilon]
#> \end{aligned}
#> $$
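The left-hand sides of the logit and probit equations differ because the two links map the linear predictor to a probability in different ways: the logit link uses the inverse log-odds transform, and the probit link uses Φ, the standard normal cumulative distribution function. The base R sketch below checks both relationships numerically. It uses mtcars as a stand-in data set (an assumption made so that the example runs without extra packages), and all object names are introduced here for illustration.

```r
# mtcars is a stand-in here because it ships with base R
fit_logit  <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))
fit_probit <- glm(am ~ wt, data = mtcars, family = binomial(link = "probit"))

# Linear predictor for each row: alpha + beta_1 * wt
eta_logit  <- predict(fit_logit,  type = "link")
eta_probit <- predict(fit_probit, type = "link")

# Logit: P = 1 / (1 + exp(-eta)), which is plogis() in R
p_logit <- plogis(eta_logit)

# Probit: P = Phi(eta), the standard normal CDF, which is pnorm() in R
p_probit <- pnorm(eta_probit)

# Both should match the fitted probabilities reported by predict()
all.equal(p_logit,  predict(fit_logit,  type = "response"))
all.equal(p_probit, predict(fit_probit, type = "response"))
```

In other words, the logit equation is written on the log-odds scale, while the probit equation is written on the probability scale with the right-hand side wrapped in Φ.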

Ordered logistic regression with MASS::polr()

set.seed(1234)
df <- data.frame(outcome = factor(rep(LETTERS[1:3], 100),
                                  levels = LETTERS[1:3],
                                  ordered = TRUE),
                 continuous_1 = rnorm(300, 100, 1),
                 continuous_2 = rnorm(300, 50, 5))

model_ologit <- MASS::polr(outcome ~ continuous_1 + continuous_2, 
                           data = df, Hess = TRUE, method = "logistic")
model_oprobit <- MASS::polr(outcome ~ continuous_1 + continuous_2, 
                            data = df, Hess = TRUE, method = "probit")

extract_eq(model_ologit, wrap = TRUE)
#> $$
#> \begin{aligned}
#> \log\left[ \frac { P( \operatorname{A} \geq \operatorname{B} ) }{ 1 - P( \operatorname{A} \geq \operatorname{B} ) } \right] &= \alpha_{1} + \beta_{1}(\operatorname{continuous\_1}) + \beta_{2}(\operatorname{continuous\_2}) + \epsilon \\
#> \log\left[ \frac { P( \operatorname{B} \geq \operatorname{C} ) }{ 1 - P( \operatorname{B} \geq \operatorname{C} ) } \right] &= \alpha_{2} + \beta_{1}(\operatorname{continuous\_1}) + \beta_{2}(\operatorname{continuous\_2}) + \epsilon
#> \end{aligned}
#> $$
extract_eq(model_oprobit, wrap = TRUE)
#> $$
#> \begin{aligned}
#> P(\operatorname{A} \geq \operatorname{B}) &= \Phi[\alpha_{1} + \beta_{1}(\operatorname{continuous\_1}) + \beta_{2}(\operatorname{continuous\_2}) + \epsilon] \\
#> P(\operatorname{B} \geq \operatorname{C}) &= \Phi[\alpha_{2} + \beta_{1}(\operatorname{continuous\_1}) + \beta_{2}(\operatorname{continuous\_2}) + \epsilon]
#> \end{aligned}
#> $$

Ordered regression (logit and probit) with ordinal::clm()

set.seed(1234)
df <- data.frame(outcome = factor(rep(LETTERS[1:3], 100),
                                  levels = LETTERS[1:3],
                                  ordered = TRUE),
                 continuous_1 = rnorm(300, 1, 1),
                 continuous_2 = rnorm(300, 5, 5))

model_ologit <- ordinal::clm(outcome ~ continuous_1 + continuous_2, 
                             data = df, link = "logit")
model_oprobit <- ordinal::clm(outcome ~ continuous_1 + continuous_2, 
                              data = df, link = "probit")

extract_eq(model_ologit, wrap = TRUE)
#> $$
#> \begin{aligned}
#> \log\left[ \frac { P( \operatorname{A} \geq \operatorname{B} ) }{ 1 - P( \operatorname{A} \geq \operatorname{B} ) } \right] &= \alpha_{1} + \beta_{1}(\operatorname{continuous\_1}) + \beta_{2}(\operatorname{continuous\_2}) + \epsilon \\
#> \log\left[ \frac { P( \operatorname{B} \geq \operatorname{C} ) }{ 1 - P( \operatorname{B} \geq \operatorname{C} ) } \right] &= \alpha_{2} + \beta_{1}(\operatorname{continuous\_1}) + \beta_{2}(\operatorname{continuous\_2}) + \epsilon
#> \end{aligned}
#> $$
extract_eq(model_oprobit, wrap = TRUE)
#> $$
#> \begin{aligned}
#> P(\operatorname{A} \geq \operatorname{B}) &= \Phi[\alpha_{1} + \beta_{1}(\operatorname{continuous\_1}) + \beta_{2}(\operatorname{continuous\_2}) + \epsilon] \\
#> P(\operatorname{B} \geq \operatorname{C}) &= \Phi[\alpha_{2} + \beta_{1}(\operatorname{continuous\_1}) + \beta_{2}(\operatorname{continuous\_2}) + \epsilon]
#> \end{aligned}
#> $$

Extension

This project is brand new. If you would like to contribute, we’d love your help! We are particularly interested in extending to more models. We hope to support any model supported by broom in the future.

Code of Conduct

Please note that the ‘equatiomatic’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

A note of appreciation

We’d like to thank the authors of the {palmerpenguins} package for generously allowing us to incorporate the penguins dataset in our package for example usage.

Horst AM, Hill AP, Gorman KB (2020). palmerpenguins: Palmer Archipelago (Antarctica) penguin data. R package version 0.1.0. https://allisonhorst.github.io/palmerpenguins/

Install

install.packages('equatiomatic')

Version: 0.1.0
License: MIT + file LICENSE
Maintainer: Daniel Anderson
Last published: August 27th, 2020
Monthly downloads: 1,242

Functions in equatiomatic (0.1.0)

add_tex_ital_v: Wrap text in \operatorname{} (vectorized)
add_tex_subscripts: Wrap text in _{}
print.equation: Print 'LaTeX' equations
escape_tex: Escape TeX
add_tex_mult: Add multiplication symbol for interaction terms
create_term: Create a full term w/subscripts
modify_lhs_for_link: Modifies lhs of equations that include a link function
extract_lhs: Generic function for extracting the left hand side from a model
create_eq.default: Create the full equation
extract_subscripts: Extract the subscripts from a given term
extract_rhs: Extract right-hand side
extract_lhs.clm: Extract left-hand side of a clm object
extract_lhs.glm: Extract left-hand side of a glm object
extract_all_subscripts: Extract all subscripts
wrap_rhs: Generic function for wrapping the RHS of a model equation in something, like how the RHS of probit is wrapped in φ()
mapply_chr
extract_lhs.lm: Extract left-hand side of an lm object
extract_lhs.polr: Extract left-hand side of a polr object
fix_coef_signs: Deduplicate operators
anno_greek: Intermediary function to wrap text in \beta_{}
extract_lhs2: Generic function for extracting the distribution-based left hand side from a model
add_tex_subscripts_v: Wrap text in _{}
detect_primary: Detect if a given term is part of a vector of full terms
equatiomatic-package: equatiomatic: Transform Models into 'LaTeX' Equations
extract_primary_term: Extract the primary terms from all terms
extract_eq: 'LaTeX' code for R models
add_tex_ital: Wrap text in \operatorname{}
add_coefs.default: Add coefficient values to the equation
add_greek.default: Adds greek symbols to the equation
penguins: Size measurements for adult foraging penguins near Palmer Station, Antarctica