col_search: Search Catalogue of Life for taxonomic IDs

Description

Search Catalogue of Life for taxonomic IDs

Usage

col_search(name = NULL, id = NULL, start = NULL, checklist = NULL,
  response = "terse", ...)

Arguments

name

The string to search for. Only exact matches of the name given will be returned, unless one or more wildcards are included in the search string. An * (asterisk) character denotes a wildcard; a % (percent) character may also be used. The name must be at least 3 characters long, not counting wildcard characters.

id

The record ID of the specific record to return (only for scientific names of species or infraspecific taxa).

start

The first record to return. If omitted, the results are returned from the first record (start=0). This is useful if the total number of results is larger than the maximum number of results returned by a single Web service query (currently the maximum number of results returned by a single query is 500 for terse queries and 50 for full queries).
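
As a rough sketch of how start could be used to page through a large result set: the helper page_starts() below is hypothetical (it is not part of the package) and only computes the offsets. The default page size of 500 is the terse-query limit quoted above; the commented-out calls assume network access and a responding COL service.

```r
# Hypothetical helper: zero-based `start` offsets needed to cover `total`
# results when each query returns at most `page_size` records.
page_starts <- function(total, page_size = 500) {
  if (total <= 0) return(numeric(0))
  seq(0, total - 1, by = page_size)
}

page_starts(1200)
#> [1]    0  500 1000

# Assumed usage (not run here):
# first <- col_search(name = "Poa*")[[1]]
# total <- as.numeric(attr(first, "total_number_of_results"))
# pages <- lapply(page_starts(total),
#   function(s) col_search(name = "Poa*", start = s))
```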

checklist

(numeric) The year of the checklist to query, if you want a specific year's checklist instead of the latest. Options range from 2007 to the current year. By default, the current year is used. For 2014 and earlier, only an XML object is returned, which you can parse yourself.

response

(character) one of "terse" or "full"

...

Curl options passed on to crul::HttpClient

Value

When checklist is 2015 or greater, a list of data.frames, named with the input vector of names or ids. Each data.frame has attributes you can access with, for example, attr(df, "error_message"):

id
name
total_number_of_results
number_of_results_returned
start
error_message
version
rank
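
To illustrate reading these attributes, here is a minimal sketch. The helper col_attrs() is hypothetical (not part of the package), and the data.frame used to exercise it is a stand-in with a made-up value rather than real COL output. exact = TRUE is used so that "name" is not partially matched to the data.frame's built-in "names" attribute.

```r
# Attribute names as listed in this section
col_attr_names <- c("id", "name", "total_number_of_results",
  "number_of_results_returned", "start", "error_message", "version", "rank")

# Hypothetical helper: collect the documented attributes of one result
# data.frame into a named list (attributes that are absent come back as NULL)
col_attrs <- function(df) {
  vals <- lapply(col_attr_names, function(a) attr(df, a, exact = TRUE))
  stats::setNames(vals, col_attr_names)
}

# Stand-in object; a real one would be an element of col_search()'s result
df <- data.frame(x = 1)
attr(df, "start") <- 0
col_attrs(df)$start
```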

If checklist is 2014 or earlier, COL did not provide JSON as a response format, so an xml_document object is returned for each input name or id.

Rate limiting

COL introduced rate limiting recently (as of this writing, 2019-11-14), but we have no information on what the rate limits are. If you run into the limit, you'll see an error like "Error: Too Many Requests (HTTP 429)". To avoid it, space out your requests, for example by calling Sys.sleep() between successive requests.
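
One possible way to space out requests is sketched below. throttled_lapply() is a hypothetical helper, not part of the package, and the delay is an arbitrary guess since the actual limits are unknown.

```r
# Hypothetical helper: call `f` on each element of `x`, pausing `delay`
# seconds between consecutive calls (no pause after the last one).
throttled_lapply <- function(x, f, delay = 1, ...) {
  out <- vector("list", length(x))
  for (i in seq_along(x)) {
    # list() wrapper keeps the slot even if f() returns NULL
    out[i] <- list(f(x[[i]], ...))
    if (i < length(x)) Sys.sleep(delay)
  }
  names(out) <- names(x)
  out
}

# Assumed usage (not run here; requires network access):
# res <- throttled_lapply(c("Apis", "Poa"),
#   function(n) col_search(name = n), delay = 2)
```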

Details

You must provide one of name or id. The other parameters (such as response and start) are optional.

References

http://webservice.catalogueoflife.org/

Examples

# NOT RUN {
# A basic example
col_search(name="Apis")
col_search(name="Agapostemon")
col_search(name="Poa")

# Get full response, i.e., more data
col_search(name="Apis", response="full")
col_search(name="Poa", response="full")

# Many names
col_search(name=c("Apis","Puma concolor"))
col_search(name=c("Apis","Puma concolor"), response = "full")

# checklist year 2014 or earlier returns an xml_document
col_search(name="Agapostemon", checklist=2012)
col_search(name=c("Agapostemon", "Megachile"), checklist=2011)

# An example where there is no data
col_search(id = "36c623ad9e3da39c2e978fa3576ad415")
col_search(id = "36c623ad9e3da39c2e978fa3576ad415", response = "full")
col_search(id = "787ce23969f5188c2467126d9a545be1")
col_search(id = "787ce23969f5188c2467126d9a545be1", response = "full")
col_search(id = c("36c623ad9e3da39c2e978fa3576ad415",
  "787ce23969f5188c2467126d9a545be1"))
## a synonym
col_search(id = "f726bdaa5924cabf8581f99889de51fc")
col_search(id = "f726bdaa5924cabf8581f99889de51fc", response = "full")
# }
