name_lookup: Look up names in all taxonomies in GBIF.

Description

Look up names in all taxonomies in GBIF.

This service uses fuzzy lookup, so you can enter partial names and get back the names that match. See the examples below.

Faceting: If facet=FALSE or is left at the default (NULL), no faceting is done, and all parameters with facet in their name (facetOnly, facetMincount, facetMultiselect) are ignored.

Usage

name_lookup(query = NULL, rank = NULL, higherTaxonKey = NULL, status = NULL, isExtinct = NULL, habitat = NULL, nameType = NULL, datasetKey = NULL, nomenclaturalStatus = NULL, limit = 100, start = NULL, facet = NULL, facetMincount = NULL, facetMultiselect = NULL, type = NULL, hl = NULL, verbose = FALSE, return = "all", ...)

Arguments

query

Query term(s) for full text search.

rank

(character) Filters by taxonomic rank as one of: CLASS, CULTIVAR, CULTIVAR_GROUP, DOMAIN, FAMILY, FORM, GENUS, INFORMAL, INFRAGENERIC_NAME, INFRAORDER, INFRASPECIFIC_NAME, INFRASUBSPECIFIC_NAME, KINGDOM, ORDER, PHYLUM, SECTION, SERIES, SPECIES, STRAIN, SUBCLASS, SUBFAMILY, SUBFORM, SUBGENUS, SUBKINGDOM, SUBORDER, SUBPHYLUM, SUBSECTION, SUBSERIES, SUBSPECIES, SUBTRIBE, SUBVARIETY, SUPERCLASS, SUPERFAMILY, SUPERORDER, SUPERPHYLUM, SUPRAGENERIC_NAME, TRIBE, UNRANKED, VARIETY

higherTaxonKey

Filters by any of the higher Linnean rank keys. Note this is within the respective checklist and not searching nub keys across all checklists.

status

Filters by the taxonomic status as one of:

ACCEPTED
DETERMINATION_SYNONYM Used for unknown child taxa referred to via spec, ssp, ...
DOUBTFUL Treated as accepted, but doubtful whether this is correct.
HETEROTYPIC_SYNONYM More specific subclass of SYNONYM.
HOMOTYPIC_SYNONYM More specific subclass of SYNONYM.
INTERMEDIATE_RANK_SYNONYM Used in nub only.
MISAPPLIED More specific subclass of SYNONYM.
PROPARTE_SYNONYM More specific subclass of SYNONYM.
SYNONYM A general synonym, the exact type is unknown.

isExtinct

(logical) Filters by extinction status (e.g. isExtinct=TRUE)

habitat

(character) Filters by habitat. One of: marine, freshwater, or terrestrial

nameType

Filters by the name type as one of:

BLACKLISTED surely not a scientific name.
CANDIDATUS Candidatus is a component of the taxonomic name for a bacterium that cannot be maintained in a Bacteriology Culture Collection.
CULTIVAR a cultivated plant name.
DOUBTFUL doubtful whether this is a scientific name at all.
HYBRID a hybrid formula (not a hybrid name).
INFORMAL a scientific name with some informal addition like "cf." or an indeterminate name such as Abies spec.
SCINAME a scientific name which is not well formed.
VIRUS a virus name.
WELLFORMED a well formed scientific name according to present nomenclatural rules.

datasetKey

Filters by the dataset's key (a uuid)

nomenclaturalStatus

Not yet implemented, but will eventually allow for filtering by a nomenclatural status enum

limit

Number of records to return. Maximum: 1000.

start

Record number to start at.
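
Because limit is capped at 1000, larger result sets have to be fetched in pages by combining limit and start. The following is a minimal sketch of computing the start values, assuming start is a zero-based offset (consistent with the paging example in the Examples section, where the second page of 1000 uses start=1000). The helper page_starts is hypothetical and not part of rgbif.

```r
# Hypothetical helper (not part of rgbif): compute the `start` value for each
# page, assuming `start` is a zero-based record offset.
page_starts <- function(total, limit = 1000) {
  if (total <= 0) return(numeric(0))
  seq(0, total - 1, by = limit)
}

# Offsets needed to cover 2500 records in pages of 1000
starts <- page_starts(2500, limit = 1000)
print(starts)

# Not run (requires rgbif and network access): fetch each page in turn
# pages <- lapply(starts, function(s) {
#   name_lookup(query = 'mammalia', limit = 1000, start = s, return = 'data')
# })
```

In practice the total would come from the metadata of a first call rather than being hard-coded.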

facet

A list of facet names used to retrieve the 100 most frequent values for a field. Allowed facets are: datasetKey, higherTaxonKey, rank, status, isExtinct, habitat, and nameType. Additionally threat and nomenclaturalStatus are legal values but not yet implemented, so data will not yet be returned for them.

facetMincount

Used in combination with the facet parameter. Set facetMincount=# to exclude facets with a count less than #. For example, http://bit.ly/1bMdByP only shows the type value 'ACCEPTED' because the other statuses have counts less than 7,000,000.

facetMultiselect

(logical) Used in combination with the facet parameter. Set facetMultiselect=TRUE to still return counts for values that are not currently filtered. For example, http://bit.ly/19YLXPO still shows all status values even though status is being filtered by status=ACCEPTED.

type

Type of name. One of occurrence, checklist, or metadata.

hl

(logical) Set hl=TRUE to highlight terms matching the query when in fulltext search fields. The highlight will be an emphasis tag of class 'gbifH1', e.g. query='plant', hl=TRUE. Fulltext search fields include: title, keyword, country, publishing country, publishing organization title, hosting organization title, and description. One additional full text field is searched which includes information from metadata documents, but the text of this field is not returned in the response.

verbose

(logical) If TRUE, all data is returned as a list for each element. If FALSE (default) a subset of the data that is thought to be most essential is organized into a data.frame.

return

One of data, meta, facets, hierarchy, names, or all. data returns a data.frame with the data. facets returns the facets if faceting was requested, or an empty list if not. meta returns the metadata for the entire call. hierarchy returns the hierarchies for each taxon. names returns the vernacular (common) names for each taxon. all gives all data back in a list; each element is NULL if it has no contents. The hierarchies and names slots are named by the GBIF key, which matches the first column of the data.frame in the data slot, so you can combine them using the key.
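
As a rough illustration of combining slots by key, the sketch below attaches one vernacular name per row to the data slot. It uses small mock objects in place of a real result, and it assumes each entry of the names slot is a data.frame with a vernacularName column; that column name, the keys, and the values shown are assumptions made for illustration, so check the structure of your own output before relying on it.

```r
# Mock stand-ins for out$data and out$names (hypothetical keys and values).
dat <- data.frame(
  key = c(101, 102),
  canonicalName = c("Aus bus", "Cus dus"),
  stringsAsFactors = FALSE
)
# Assumption: the names slot is a list named by key, each entry a data.frame
# with a vernacularName column. Key 102 has no entry here.
nms <- list(
  `101` = data.frame(vernacularName = c("first name", "second name"),
                     stringsAsFactors = FALSE)
)

# Take the first vernacular name for each key, or NA when none is present
dat$vernacular <- vapply(as.character(dat$key), function(k) {
  v <- nms[[k]]
  if (is.null(v) || nrow(v) == 0) NA_character_ else v$vernacularName[1]
}, character(1), USE.NAMES = FALSE)

print(dat)
```

With a real result, dat and nms would be replaced by the data and names elements of a call made with return="all".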

...

Further named parameters, such as query, path, etc., passed on to modify_url within the GET call. Unnamed parameters will be combined with config.

Value

A list of length three. The first element is metadata. The second is either a data.frame (verbose=FALSE, default) or a list (verbose=TRUE), and the third element is the facet data.

References

http://www.gbif.org/developer/species#searching

Examples

## Not run: 
# # Look up names like mammalia
# name_lookup(query='mammalia', limit = 20)
# 
# # Paging
# name_lookup(query='mammalia', limit=1)
# name_lookup(query='mammalia', limit=1, start=2)
# 
# # large requests, use start parameter
# first <- name_lookup(query='mammalia', limit=1000)
# second <- name_lookup(query='mammalia', limit=1000, start=1000)
# tail(first$data)
# head(second$data)
# first$meta
# second$meta
# 
# # Get all data and parse it, removing descriptions which can be quite long
# out <- name_lookup('Helianthus annuus', rank="species", verbose=TRUE)
# lapply(out$data, function(x) x[!names(x) %in% c("descriptions","descriptionsSerialized")])
# 
# # Search for a genus, returning just data
# name_lookup(query='Cnaemidophorus', rank="genus", return="data")
# 
# # Just metadata
# name_lookup(query='Cnaemidophorus', rank="genus", return="meta")
# 
# # Just hierarchies
# name_lookup(query='Cnaemidophorus', rank="genus", return="hierarchy")
# 
# # Just vernacular (common) names
# name_lookup(query='Cnaemidophorus', rank="genus", return="names")
# 
# # Fuzzy searching
# name_lookup(query='Cnaemidophor', rank="genus")
# 
# # Limit records to certain number
# name_lookup('Helianthus annuus', rank="species", limit=2)
# 
# # Query by habitat
# name_lookup(habitat = "terrestrial", limit=2)
# name_lookup(habitat = "marine", limit=2)
# name_lookup(habitat = "freshwater", limit=2)
# 
# # Using faceting
# name_lookup(facet='status', limit=0, facetMincount='70000')
# name_lookup(facet=c('status','higherTaxonKey'), limit=0, facetMincount='700000')
# 
# name_lookup(facet='nameType', limit=0)
# name_lookup(facet='habitat', limit=0)
# name_lookup(facet='datasetKey', limit=0)
# name_lookup(facet='rank', limit=0)
# name_lookup(facet='isExtinct', limit=0)
# 
# name_lookup(isExtinct=TRUE, limit=0)
# 
# # text highlighting
# ## turn on highlighting
# res <- name_lookup(query='canada', hl=TRUE, limit=5)
# res$data
# name_lookup(query='canada', hl=TRUE, limit=45, return='data')
# ## and you can pass the output to gbif_names() function
# res <- name_lookup(query='canada', hl=TRUE, limit=5)
# gbif_names(res)
# 
# # Lookup by datasetKey
# name_lookup(datasetKey='3f8a1297-3259-4700-91fc-acc4170b27ce')
# 
# # Pass on httr options
# library('httr')
# name_lookup(query='Cnaemidophorus', rank="genus", config=verbose())
# ## End(Not run)
