tableParser (version 1.0.2)

Parse Tabled Content to Text Vector and Extract Statistical Standard Results

Description

tableParser extracts tabled content from NISO-JATS-coded XML, any native HTML or HML file, DOCX, and PDF documents, and collapses it into human-readable text by mimicking the actions of a screen reader. Tables in PDF documents are extracted with the 'tabulapdf' package, and their captions and footnotes cannot be extracted, so results for PDF tables should be considered less precise.

The function table2matrix() returns a list of the tables within a document as character matrices. table2text() collapses the matrix content into a list of character strings by imitating the behavior of a screen reader. The textual representation of characters and numbers can be unified with unifyMatrix() before parsing. The function table2stats() extracts the tabled statistical test results from the collapsed text with the function standardStats() from the 'JATSdecoder' package and, if activated, checks the reported and coded p-values for consistency.

Because table structures vary widely and can be complex, parsing accuracy may vary.
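To make the "screen reader" idea and the p-value consistency check more concrete, here is a minimal base-R sketch. It is not the package's implementation: collapse_matrix() and recompute_p() are hypothetical helpers, the example matrix is made up, and the sketch assumes a simple table whose first row holds column headers and whose first column holds row labels.

```r
# Hypothetical helper (not part of tableParser): read a character matrix
# row by row, pairing each body cell with its column header.
collapse_matrix <- function(m) {
  header <- m[1, -1]                 # column headers, without the corner cell
  body   <- m[-1, , drop = FALSE]    # all rows below the header
  apply(body, 1, function(row) {
    cells <- paste0(header, " = ", row[-1])
    paste0(row[1], ": ", paste(cells, collapse = ", "))
  })
}

# Made-up example table as a character matrix
m <- matrix(
  c("Predictor", "t",    "df", "p",
    "Age",       "2.10", "48", ".041",
    "Income",    "0.85", "48", ".400"),
  nrow = 3, byrow = TRUE
)

collapsed <- collapse_matrix(m)
print(collapsed)
# First element: "Age: t = 2.10, df = 48, p = .041"

# Hypothetical consistency check: recompute a two-sided p-value from t and df
# and compare it with the reported value.
recompute_p <- function(t, df) 2 * pt(abs(t), df, lower.tail = FALSE)
p_age <- recompute_p(2.10, 48)
print(round(p_age, 3))
```

Real tables often have merged cells, multi-row headers, and footnote codes, which is why a sketch like this covers only the simplest case.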

Install

install.packages('tableParser')
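
Once the package is installed, the three main functions named in the description might be combined roughly as follows. This is a sketch under assumptions: it supposes that table2matrix(), table2text(), and table2stats() each accept a file path as their first argument, which should be checked against the package manual. The wrapper parse_tables_sketch() and the file name article.html are hypothetical.

```r
# Sketch only: assumes each tableParser function accepts a file path
# as its first argument (verify against the package documentation).
parse_tables_sketch <- function(path) {
  # Return NULL if the package is not installed or the file does not exist
  if (!requireNamespace("tableParser", quietly = TRUE) || !file.exists(path)) {
    return(NULL)
  }
  list(
    matrices = tableParser::table2matrix(path),  # tables as character matrices
    text     = tableParser::table2text(path),    # collapsed, screen-reader-like text
    stats    = tableParser::table2stats(path)    # extracted statistical results
  )
}

res <- parse_tables_sketch("article.html")  # hypothetical file name
```

The guard keeps the example from failing on machines without the package or the file; in that case res is simply NULL.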

Version

1.0.2

License

GPL-3

Maintainer

Ingmar Böschen

Last Published

February 2nd, 2026

Functions in tableParser (1.0.2)

tableClass
unifyMatrixContent
get.footer
matrix2text
docx2matrix
guessCaptionFootnote
get.HTML.tables
prepareMatrix
get.caption
legendCodings
parseMatrixContent
html2unicode
table2text
table2stats
unifyStats
reexports: Objects exported from other packages
table2matrix