Learn R Programming

fulltext (version 0.1.8)

extract_tools: PDF extraction tools

Description

If you want to use ft_extract function, it currently has two options for how to extract text from PDFs: xpdf and ghostscript.

Arguments

xpdf installation

See http://www.foolabs.com/xpdf/download.html for instructions on how to download and install `xpdf`. For OSX, you an also get `xpdf` via Homebrew (https://github.com/homebrew/homebrew-x11/blob/master/xpdf.rb) with brew install xpdf. Apparently, you can optionally install Poppler, which is built on xpdf. Get it at http://poppler.freedesktop.org/

ghostscript installation

See http://www.ghostscript.com/doc/9.16/Install.htm for instructions on how to download and install `ghostscript`. For OSX, you an also get `ghostscript` via Homebrew (https://github.com/Homebrew/homebrew/blob/master/Library/Formula/ghostscript.rb) with brew install gs