fulltext-package: Fulltext search and retrieval of scholarly texts.

Description

fulltext is a single interface to many sources of scholarly texts. In practice, this means only sources that are legally usable. We will support sources that require authentication on a case-by-case basis: if more than just a few people will use a source, and it is not too burdensome to include, then we can include it.

What's included

We currently support search and full text retrieval for a variety of publishers. See ft_search for the sources supported for search, and ft_get for the sources supported for full text retrieval.

Use cases

The following tasks/use cases are supported:

search - ft_search
get texts - ft_get
get full text links - ft_links
extract text from PDFs - ft_extract
serialize to different data formats - ft_serialize
extract certain article sections (e.g., authors) - chunks
grab supplementary materials for (re-)analysis of data - ft_get_si (accepts article identifiers, as well as output from ft_search and ft_get)
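
The first few tasks in the list above can be chained together. The following is a minimal sketch, assuming the fulltext package is installed and a network connection is available. The helper name search_then_get, the example query, and the choice of from = "plos" are illustrative assumptions and do not come from this page; check ft_search, ft_links and ft_get for the arguments your installed version accepts.

```r
# Hypothetical helper (not part of fulltext): search one source, then
# collect full text links and the full texts for the hits.
search_then_get <- function(query, from = "plos", limit = 5) {
  # Search the chosen source for articles matching the query
  hits <- fulltext::ft_search(query = query, from = from, limit = limit)
  # Look up full text links for the search results
  links <- fulltext::ft_links(hits)
  # Retrieve the full texts themselves
  texts <- fulltext::ft_get(hits)
  list(hits = hits, links = links, texts = texts)
}

# Not run here because it needs network access:
# out <- search_then_get("ecology")
```

Wrapping the calls in a function keeps the sketch loadable without a network connection; the remote requests only happen when the helper is actually called.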

DOI delays

Beware that DOIs are not searchable via Crossref/Entrez immediately. The delay may be as much as a few days, though it should be less than a day, and it should become shorter as services improve. The upshot is that you may not find a match for a relatively new DOI (e.g., for an article published the same day). We've tried to account for this for some publishers. For example, for Crossref we first search Crossref for a match for the DOI, and if none is found we attempt to retrieve the full text from the publisher directly.
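
The fallback described above can be sketched in a few lines. This is an illustration of the idea only, not the package's actual implementation: get_with_fallback, lookup_stub and publisher_stub are hypothetical names, the DOIs are made up, and the two lookup functions are passed in as arguments so the sketch runs without network access.

```r
# Hypothetical sketch of "search the index first, then fall back to the
# publisher". None of these functions are part of the fulltext API.
get_with_fallback <- function(doi, lookup_index, fetch_from_publisher) {
  # Step 1: ask the index (e.g. Crossref) whether it knows this DOI
  match <- lookup_index(doi)
  if (!is.null(match)) {
    return(list(doi = doi, source = "index", result = match))
  }
  # Step 2: no match, possibly because the DOI is very new, so try the
  # publisher directly
  list(doi = doi, source = "publisher", result = fetch_from_publisher(doi))
}

# Stubs standing in for real services; the DOIs are invented
known <- list("10.1234/old" = "indexed record")
lookup_stub <- function(doi) known[[doi]]  # NULL when the DOI is unknown
publisher_stub <- function(doi) paste("full text for", doi)

old <- get_with_fallback("10.1234/old", lookup_stub, publisher_stub)
new <- get_with_fallback("10.1234/new", lookup_stub, publisher_stub)
```

Here old$source is "index" because the DOI is already known to the stub index, while new$source is "publisher" because the lookup returned NULL and the fallback was used.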

Feedback

Let us know what you think at https://github.com/ropensci/fulltext/issues