fulltext is a single interface to many sources of scholarly texts. In practice, this means only sources that are legally usable. We will support sources that require authentication on a case-by-case basis: if more than just a few people will use a source, and it's not too burdensome to include, then we can include it.
We currently include support for search and full text retrieval for a variety of publishers. See ft_search for what we include for search, and ft_get for what we include for full text retrieval.
The following tasks/use cases are supported:
search - ft_search
get texts - ft_get
get full text links - ft_links
extract text from pdfs - ft_extract
serialize to different data formats - ft_serialize
extract certain article sections (e.g., authors) - chunks
grab supplementary materials for (re-)analysis of data - ft_get_si (accepts article identifiers, and output from ft_search and ft_get)
Beware that DOIs are not searchable via Crossref/Entrez immediately. The delay may be as much as a few days, though it should usually be less than a day, and it should become shorter as services improve. As a result, you may not find a match for a relatively new DOI (e.g., for an article published the same day). We've tried to account for this for some publishers. For example, for Crossref we search Crossref for a match for a DOI, and if none is found we attempt to retrieve the full text from the publisher directly.
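The fallback described above can be sketched as follows. This is a minimal illustration of the idea, not the code fulltext actually uses: get_with_fallback, lookup_index and fetch_from_publisher are hypothetical names, the two lookup functions are toy stand-ins that make no network requests, and the DOIs are made up.

```r
# Hypothetical helper (not part of fulltext): look a DOI up in an index
# first, and go to the publisher directly if the index has no match yet.
get_with_fallback <- function(doi, lookup_index, fetch_from_publisher) {
  # Treat a lookup error the same as "no match found"
  hit <- tryCatch(lookup_index(doi), error = function(e) NULL)
  if (!is.null(hit) && length(hit) > 0) {
    return(list(doi = doi, source = "index", result = hit))
  }
  # No match in the index (e.g., a very new DOI): try the publisher
  list(doi = doi, source = "publisher", result = fetch_from_publisher(doi))
}

# Toy stand-ins so the sketch runs offline; both are assumptions
index <- c("10.1000/old" = "record for an already indexed article")
lookup_index <- function(doi) {
  if (doi %in% names(index)) index[[doi]] else NULL
}
fetch_from_publisher <- function(doi) paste("publisher copy of", doi)

res_old <- get_with_fallback("10.1000/old", lookup_index, fetch_from_publisher)
res_new <- get_with_fallback("10.1000/new", lookup_index, fetch_from_publisher)

res_old$source  # the DOI is already in the toy index
res_new$source  # the DOI is missing from the index, so the fallback is used
```

Passing the two lookup steps in as functions keeps the sketch independent of any particular service; in real use they would wrap requests to the index and to the publisher.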
Let us know what you think at https://github.com/ropensci/fulltext/issues