htmltab: Hassle-free HTML tables in R
HTML tables are a valuable data source but extracting and recasting
these data into a useful format can be tedious. htmltab is a package for
extracting structured information from HTML tables. It is similar to
readHTMLTable() of the XML package but provides two major advantages:
- First, the function automatically expands row and column spans in the header and body cells.
- Second, users are given more control over which header and body rows end up in the R table.
Additionally, the function preprocesses the table code and removes unneeded parts, which reduces the need for tedious post-processing.
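As a rough illustration of the span expansion described above, the sketch below writes a small, made-up table to a temporary file and reads it back with htmltab(). The table contents, the variable names, and the temporary file are assumptions for demonstration only. The expectation is that the cell spanning two rows is repeated in each row it covers.

```r
library(htmltab)

# Hypothetical table: the "Europe" cell spans two body rows.
html <- '
<html><body><table>
  <tr><th>Region</th><th>Country</th></tr>
  <tr><td rowspan="2">Europe</td><td>France</td></tr>
  <tr><td>Spain</td></tr>
  <tr><td>Asia</td><td>Japan</td></tr>
</table></body></html>'

# Write the markup to a temporary file and pass the file name as `doc`.
path <- tempfile(fileext = ".html")
writeLines(html, path)

# which = 1 selects the first table in the document.
regions <- htmltab(doc = path, which = 1)
print(regions)
```

The result should be an ordinary data frame with one row per body row of the table, so no manual filling of the spanned cell is needed afterwards.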
Installation
You can install the released version of htmltab from CRAN with:
install.packages("htmltab")
And the development version from GitHub with:
# install.packages("remotes")
remotes::install_github("htmltab/htmltab")
Usage
To see htmltab in action, take a look at the case studies in this blog post, the package vignette, or the package manual.
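For a quick local impression before turning to those resources, here is a minimal sketch of choosing the header rows explicitly. It assumes the header argument of htmltab() accepts a numeric vector of row positions; the table, its values, and the variable names are invented for illustration.

```r
library(htmltab)

# Hypothetical table with a two-row header: "Population" spans two columns,
# and "Country" spans both header rows.
html <- '
<html><body><table>
  <tr><th rowspan="2">Country</th><th colspan="2">Population</th></tr>
  <tr><th>2010</th><th>2020</th></tr>
  <tr><td>Aland</td><td>10</td><td>12</td></tr>
  <tr><td>Bland</td><td>20</td><td>21</td></tr>
</table></body></html>'

path <- tempfile(fileext = ".html")
writeLines(html, path)

# Treat the first two table rows as header information (assumed usage of `header`);
# the remaining rows become the body of the data frame.
population <- htmltab(doc = path, which = 1, header = 1:2)
print(colnames(population))
print(population)
```

Printing the column names shows how the two header rows were combined into a single set of variable names.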