pdfx

pdfx_html

pdfx_targz

(character) Path to one or more files on your machine. Required.

file

what

Further args passed to <code><a rd-options="httr" href="/link/GET?package=fulltext&version=0.1.6&to=httr" data-mini-rdoc="httr::GET">GET</a></code>. These aren't named, so pass them directly, e.g.
<code>verbose()</code> or <code>timeout(3)</code>.

input

write_path


Uses a web service provided by Utopia at 
<a href="http://pdfx.cs.man.ac.uk/">http://pdfx.cs.man.ac.uk/</a>. Beware, this can be quite slow. 
<code>pdfx</code> posts the PDF from your machine to the web service; 
<code>pdfx_html</code> takes the output of <code>pdfx</code> and gives back 
an HTML version of the extracted text; and <code>pdfx_targz</code> 
gives a tar.gz version of the extracted text. This will not work
with PDFs that are scans of text or that consist mostly of images.


Provides a single interface to many sources of full text
'scholarly' data, including 'Biomed Central', Public Library of
Science, 'Pubmed Central', 'eLife', 'F1000Research', 'PeerJ',
'Pensoft', 'Hindawi', 'arXiv' preprints, and more. Functionality
is included for searching for articles, downloading full or partial
text, downloading supplementary materials, and converting to various
data formats used in and outside of R.

pdfx: PDF-to-XML conversion of scientific articles using pdfx

Description

Usage

Arguments

Value

Examples
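
A minimal sketch of the workflow described above, not a definitive example. It assumes that <code>input</code> is the argument that receives the output of <code>pdfx</code> in both <code>pdfx_html</code> and <code>pdfx_targz</code>, and that <code>write_path</code> belongs to <code>pdfx_targz</code>; the page lists these argument names but does not say which function takes which. The file name <code>article.pdf</code> is hypothetical. The calls are guarded so nothing runs unless the package is installed and the file exists.

```r
# Sketch only: needs the 'fulltext' package (0.1.6, as documented here) and
# network access to the PDFX web service, which can be quite slow.
path <- "article.pdf"  # hypothetical path; use a text-based PDF, not a scan

if (requireNamespace("fulltext", quietly = TRUE) && file.exists(path)) {
  # Post the PDF from your machine to the web service and keep the result
  out <- fulltext::pdfx(file = path)

  # HTML version of the extracted text, built from the pdfx() output
  # (assumption: the argument is named 'input')
  html <- fulltext::pdfx_html(input = out)

  # tar.gz version of the extracted text, written to a temporary file
  # (assumption: 'write_path' is the destination for the archive)
  tarfile <- tempfile(fileext = ".tar.gz")
  fulltext::pdfx_targz(input = out, write_path = tarfile)
}
```

Arguments are named explicitly here because <code>what</code> is also listed as an argument; an unnamed extra value could be matched to it by position rather than forwarded to <code>GET</code>.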