rbison (version 1.0.0)

bison_solr: Search for and collect occurrence data from the USGS Bison API using their solr endpoint.

Description

This fxn is somewhat similar to bison(), but interacts with the SOLR interface https://bison.usgs.gov/#solr instead of the OpenSearch interface https://bison.usgs.gov/#opensearch, which bison() uses.

Usage

bison_solr(
  decimalLatitude = NULL,
  decimalLongitude = NULL,
  year = NULL,
  providerID = NULL,
  resourceID = NULL,
  pointPath = NULL,
  basisOfRecord = NULL,
  eventDate = NULL,
  computedCountyFips = NULL,
  computedStateFips = NULL,
  scientificName = NULL,
  hierarchy_homonym_string = NULL,
  TSNs = NULL,
  recordedBy = NULL,
  occurrenceID = NULL,
  catalogNumber = NULL,
  ITIScommonName = NULL,
  kingdom = NULL,
  collectorNumber = NULL,
  provider = NULL,
  ownerInstitutionCollectionCode = NULL,
  providedScientificName = NULL,
  ITISscientificName = NULL,
  providedCommonName = NULL,
  ITIStsn = NULL,
  centroid = NULL,
  higherGeographyID = NULL,
  providedCounty = NULL,
  calculatedCounty = NULL,
  stateProvince = NULL,
  calculatedState = NULL,
  countryCode = NULL,
  callopts = list(),
  verbose = TRUE,
  ...
)

Arguments

decimalLatitude

Geographic coordinate that specifies the north south position of a location on the Earth surface.

decimalLongitude

Geographic coordinate that specifies the east-west position of a location on the Earth surface.

year

The year the collection was taken.

providerID

(character) Unique identifier assigned by GBIF.

resourceID

(character) A unique identifier that is a concatentation of the provider identifier and the resource id seperated by a comma.

pointPath

A dynamic field that contains the location in longitude and latitude followed by the basis of record and an optional Geo (Spatial) precision. Geo (Spatial) precision is an added descriptor when the record is a county centroid.

basisOfRecord

One of these enumerated values: Observation, Germplasm, Fossil, Specimen, Literature, Unknown, or Living.

eventDate

The date when the occurrence was recorded. Dates need to be of the form YYYY-MM-DD

computedCountyFips

County FIPS code conforming to standard FIPS 6-4 but with leading zeros removed. See fips dataset for codes

computedStateFips

The normalized state fips code. See fips dataset for codes

scientificName

The species scientific name that is searchable in a case insensitive way.

hierarchy_homonym_string

hierarachy of the accepted or valid species name starting at kingdom. If the name is a taxonomic homonym more than one string is provided seperated by ';'.

TSNs

Accepted or valid name is provided. If the name is a taxonmic homonym more than one TSN is provided.

recordedBy

Individual responsible for the scientific record.

occurrenceID

Non-persistent unique identifier.

catalogNumber

Unique key for every record (occurrence/row) within a dataset that is not manipulated nor changed (nor generated, if not provided) during the data ingest.

ITIScommonName

Common name(s) from ITIS, e.g. "Canada goose"

kingdom

Kingdom name, from GBIF raw occurrence or BISON provider.

collectorNumber

An identifier given to the occurrence at the time it was recorded, such as a specimen collector's number. / e.g., "SJM030022".

provider

Non-persistent unique identifier.

ownerInstitutionCollectionCode

Name for the dataset, format = OwnerInstitution-Collection. / e.g., "USGS NAWQA BioData - Fish Occurrence Records"

providedScientificName

Full scientific name as provided in the dataset, with authorship and date information if known.

ITISscientificName

Scientific name from join on ITIS table, calculated e.g, "Bison bison"

providedCommonName

A list (concatenated and separated) of the available vernacular species names. / e.g., "common shrew, Masked Shrew"

ITIStsn

Phase II: ITIS TSN corresponding to clean_provided_scientific_name. May be invalid,unaccepted. Calculated. e.g., "3250", "05713"

centroid

Text string indicating that provided lat/lon point represents a polygon centroid. Text provides description of the centroid.

higherGeographyID

5-digit numeric text string geographic code for the state-county combination provided by data provider. / e.g,. "13029"

providedCounty

Full county, parish, or organized borough name, as provided in the dataset. If provided, Verbatum State is required. Is not changed during data ingest. / e.g., "Fairfax"

calculatedCounty

Full county, parish, or organized borough name of the occurrence calculated. / e.g., "Fairfax"

stateProvince

Full name of state or territory of the occurrence, as provided in the dataset.

calculatedState

U.S. State or territory name calculated. e.g., "Puerto Rico"

countryCode

The geographic location of the specific occurrence, expressed through a constrained vocabulary of countries using 2-letter ISO country code.

callopts

Further args passed on to crul::HttpClient() for HTTP debugging/inspecting. In bison, bison_providers, and bison_stats, ... is used instead of callopts, but ... is used here to pass additional Solr params.

verbose

Print message with url (TRUE, default).

...

Additional SOLR query arguments. See details.

Value

An object of class bison_solr - which is a list with slots for number of records found (num_found), records, highlight, or facets.

SOLR search parameters passed on via <code>...</code>

  • fl: Fields to return in the query

  • rows: Number of records to return

  • start: Record number to start at (an offset)

  • sort: Field to sort by, see examples

  • facet: Facet or not, logical

  • facet.field: Fields to facet by

To do pagination, use rows and start together

You can also use highlighting in solr search, but I'm not sure I see a use case for it with BISON data, though you can do it with this function.

For a tutorial see here http://lucene.apache.org/solr/3_6_2/doc-files/tutorial.html

Range searches

If you pass a vector of length 2 to a parameter we construct a range query for you. For example, c(4, 5) turns into [4 TO 5]. The [] syntax means the search is inclusive, meaning 4 to 5, including 4 and 5. Let us know if you think you need more flexible searching. That is, doing exclusive \{\} or mixed searches (\{] or [\}). Range searches can only be done with variables that are numeric/integer or dates or strings that can be coerced to dates. Dates need to be of the form YYYY-MM-DD

Details

Named parameters in this function are combined with AND and passed on to q SOLR parameter. Of course parameters can be more flexibly combined - let us know if you want that flexibility and we can think about that.

References

https://bison.usgs.gov/#solr

See Also

bison_tax(), bison()

The USGS BISON Solr installation version as of 2014-10-14 was 4.4.

Examples

Run this code
# NOT RUN {
x=bison_solr(scientificName='Ursus americanus')

bison_solr(scientificName='Ursus americanus', computedStateFips='02',
 fl="scientificName")

x <- bison_solr(scientificName='Ursus americanus', computedStateFips='02', rows=50)
x$points$computedStateFips
head(x$points)

bison_solr(ITISscientificName='Ursus americanus', rows=50)

bison_solr(providerID = 220)

# pagination
bison_solr(scientificName = 'Ursus americanus', rows = 10)
bison_solr(scientificName = 'Ursus americanus', rows = 10, start = 10)

# combining parameters
x <- bison_solr(eventDate = c('2008-01-01', '2010-12-31'), 
  ITISscientificName="Helianthus annuus", rows = 100)
head(x$points)
sort(x$points$eventDate)

# range queries
## range search with providerID
bison_solr(providerID = c(220, 221))
## date range search
x <- bison_solr(eventDate = c('2010-08-08', '2010-08-21'))
sort(x$points$eventDate)
## TSN range search
x <- bison_solr(TSNs = c(174773, 174775), rows = 100)
sort(x$points$TSN)
## can't do range searches with character strings (that are not dates)
# bison_solr(kingdom = c("Animalia", "Plantae"))

# more examples
bison_solr(TSNs = 174773)
bison_solr(occurrenceID = 576630651)
bison_solr(catalogNumber = 'OBS101299944')
bison_solr(ITIScommonName = "Canada goose")
bison_solr(kingdom = "Animalia")
bison_solr(kingdom = "Plantae")

# Mapping
out <- bison_solr(scientificName='Ursus americanus', rows=200)
bisonmap(out)

out <- bison_solr(scientificName='Helianthus annuus', rows=800)
bisonmap(out)

# Using additional solr fields
## Faceting
bison_solr(scientificName='Helianthus annuus', rows=0, facet='true',
 facet.field='computedStateFips')

## Highlighting
bison_solr(scientificName='Helianthus annuus', rows=10, hl='true',
 hl.fl='scientificName')

## Use of hierarchy_homonym_string
bison_solr(hierarchy_homonym_string = '-202423-914154-914156-158852-')
## -- This is a bit unwieldy, but you can find this string in the output 
## of a call, like this
x <- bison_solr(scientificName='Ursus americanus', rows=1)
string <- x$points$hierarchy_homonym_string
bison_solr(hierarchy_homonym_string = string)

# The pointPath parameter
bison_solr(pointPath = '/-110.0,45.0/specimen')

# Curl options
bison_solr(scientificName='Ursus americanus', callopts=list(verbose = TRUE))
# }

Run the code above in your browser using DataCamp Workspace