Design Principles, Comparisons and Limitations

This document briefly describes the design of huxtable, and compares it with other R packages for creating tables. A current version is on the web in HTML or PDF formats.

Design principles

I wrote this package because I wanted a simple way to create tables in my LaTeX documents. At the same time, I wanted to be able to output HTML or Markdown for use in RStudio. And, I wanted to be able to edit tables intuitively using standard R features. My typical use case is creating tables of regression outputs, but I also wanted to be able to represent arbitrary data, like a table of descriptive statistics or of plain text.

The idea behind huxtable is to store data in a normal data frame, along with properties that describe how to display the data, at cell, column, row or table level. Operations on the data frame work as normal, and they also affect the display properties. Then, the data can be output in an appropriate format. At the moment, those formats are LaTeX, HTML, Markdown and on-screen pretty printing, but more could be added.

Another design choice was to have separate functions per feature. Many existing packages use a single function with a large number of options. For instance, print.xtable in the xtable package has 34 options, and texreg in the texreg package has 41. Having one function per feature should make life easier for the end user. It should also lead to clearer code: each function starts with a valid huxtable, changes one thing, and returns a valid huxtable.

The output formats are very different, and decisions have to be made as to what any package will support. My background is more in HTML. This is reflected in some of the huxtable properties, like per-cell borders and padding. The package tries to keep output reasonably similar between LaTeX and HTML, but there are inevitably some differences and limitations (see below). For Markdown and on-screen output, obviously, only a few basic properties are supported.

The package makes no attempt to output beautiful HTML or LaTeX source code. In fact, in the case of LaTeX, it's pretty ugly. The approach is "do what it takes to get the job done".

Comparing Huxtable With Other Packages

R has many different packages to create LaTeX and HTML tables. The table(s) below list those I know and the features they have. The table is produced with huxtable, of course ;-)

suppressPackageStartupMessages(library(huxtable)) is_latex <- guess_knitr_output_format() == 'latex' comp <- read.csv('comparison.csv', stringsAsFactors = FALSE, header = FALSE) ch <- as_hux(comp) bold(ch)[1,] <- TRUE bottom_border(ch)[1,] <- 1 subsections <- ch[[1]] %in% c('HTML output', 'LaTeX output', 'Other features', 'Other formats', 'Notes') top_border(ch)[subsections, ] <- 1 bold(ch)[subsections, 1] <- TRUE italic(ch)[subsections, 1] <- TRUE background_color(ch)[, seq(3, ncol(ch), 2)] <- grey(.95) background_color(ch)[, 2] <- 'lightpink' rotation(ch)[1,] <- 270 valign(ch)[1,] <- 'middle' align(ch)[-1, -1] <- 'center' ch <- set_all_padding(ch, -1, everywhere, 0) ch <- rbind(ch, rep('', ncol(ch))) last <- nrow(ch) ch[last, 1] <- 'A (Y) means that there is limited support for the feature. For example, multirow cells may only be supported in headers, or only horizontal border lines may work.' font_size(ch)[last, 1] <- 10 colspan(ch)[last, 1] <- ncol(ch) bold(ch)[last, 1] <- FALSE italic(ch)[last, 1] <- FALSE bottom_border(ch)[last, 1] <- 2 wrap(ch) <- TRUE if (is_latex) { row_height(ch) <- c('20pt', rep('10pt', nrow(ch) - 1)) col_width(ch) <- c('120pt', rep('36pt', ncol(ch) - 1)) height(ch) <- 0.95 position(ch) <- 'left' font_size(ch) <- 10 font_size(ch)[c(last - 1, last), 1] <- 8 ch1 <- ch[,1:8] ch2 <- ch[,c(1, 9:15)] caption(ch1) <- 'Comparison table, part 1' caption(ch2) <- 'Comparison table, part 2' } else { row_height(ch) <- c('80pt', rep('14pt', nrow(ch) - 1)) col_width(ch) <- c('60pt', rep('20pt', ncol(ch) - 1)) } if (! is_latex) ch if (is_latex) ch1 if (is_latex) ch2

This comparison doesn't necessarily tell you the important stuff: how easy is the interface? Is the code currently maintained? I have not used all these packages, but my personal (and subjective) recommendations are:

  • texreg is very good for producing regression tables. It can cope with a huge variety of inputs. Hopefully, with the advent of broom, it will get easier for many packages to do this.
  • ztable seems to support a lot of functionality, though I haven't used it.
  • xtable is old, but reliable and widely available.
  • tables has an interesting interface for producing summary statistics. It looks complex but powerful.
  • pixiedust is quite close to huxtable. It has many features and is well-written. The interface is slightly different: you use sprinkle() to add features to a tidy data frame.
  • Lastly, formattable is a new kid on the block. It has well-written code and some interesting ideas. It's HTML-only at present.

Limitations

Some people love LaTeX. Other people think they have to use it to be "scientific". (Sadly, this is all too common in my field.) Personally, I can tolerate it at a distance. It's certainly not easy to produce LaTeX code combining a wide variety of table features. Current limitations of huxtable include:

  • You can't change horizontal border widths within a single line.
  • Table width may be unpredictable and your text may spill over. Adjust on a trial and error basis.
  • If table position is problematic, check that your table is wide enough, and set width() if necessary.
  • Vertical alignment is unlikely to work as you might expect.

Some of these may be fixed... at some point!

There are also limitations in HTML:

  • Rotation is likely to mess up your cells. This won't be fixed until there is a simple way to do cell rotation in CSS. (There are plenty of complicated ways.)

Lastly, as mentioned above, HTML and LaTeX output is likely to differ. For example, in LaTeX, height is set by putting the table inside a \resizebox object. That can obviously mess up other sizes like table width. Your Mileage May Vary.

Feel free to report bugs at github, or to email me (davidhughjones@gmail.com) with brickbats and bouquets.