solr (version 0.1.0)

solr_facet: Do faceted searches, outputting facets only.

Description

Do faceted searches, outputting facets only.

Usage

solr_facet(q = "*:*", facet.query = NA, facet.field = NA,
  facet.prefix = NA, facet.sort = NA, facet.limit = NA,
  facet.offset = NA, facet.mincount = NA, facet.missing = NA,
  facet.method = NA, facet.enum.cache.minDf = NA, facet.threads = NA,
  facet.date = NA, facet.date.start = NA, facet.date.end = NA,
  facet.date.gap = NA, facet.date.hardend = NA, facet.date.other = NA,
  facet.date.include = NA, facet.range = NA, facet.range.start = NA,
  facet.range.end = NA, facet.range.gap = NA, facet.range.hardend = NA,
  facet.range.other = NA, facet.range.include = NA, start = NA,
  rows = NA, key = NA, url = NA, wt = "json", raw = FALSE,
  callopts = list(), ...)

Arguments

q
Query terms. See examples.
facet.query
This param allows you to specify an arbitrary query in the Lucene default syntax to generate a facet count. By default, faceting returns a count of the unique terms for a "field", while facet.query allows you to determine counts for arbitrary term
facet.field
This param allows you to specify a field which should be treated as a facet. It will iterate over each Term in the field and generate a facet count using that Term as the constraint. This parameter can be specified multiple times to indicate multi
facet.prefix
Limits the terms on which to facet to those starting with the given string prefix. Note that unlike fq, this does not change the search results -- it merely reduces the facet values returned to those beginning with the specified prefix. This param
facet.sort
See Details.
facet.limit
This param indicates the maximum number of constraint counts that should be returned for the facet fields. A negative value means unlimited. Default: 100. Can be specified on a per field basis.
facet.offset
This param indicates an offset into the list of constraints to allow paging. Default: 0. This parameter can be specified on a per field basis.
facet.mincount
This param indicates the minimum count a facet field constraint must have to be included in the response. Default: 0. This parameter can be specified on a per field basis.
facet.missing
Set to "true" this param indicates that in addition to the Term based constraints of a facet field, a count of all matching results which have no value for the field should be computed. Default: FALSE. This parameter can be specified on a per fiel
facet.method
See Details.
facet.enum.cache.minDf
This param indicates the minimum document frequency (number of documents matching a term) for which the filterCache should be used when determining the constraint count for that term. This is only used when facet.method=enum method of faceting. A
facet.threads
This param will cause loading the underlying fields used in faceting to be executed in parallel with the number of threads specified. Specify as facet.threads=# where # is the maximum number of threads used. Omitting this parameter or specifying t
facet.date
Specify names of fields (of type DateField) which should be treated as date facets. Can be specified multiple times to indicate multiple date facet fields.
facet.date.start
The lower bound for the first date range for all Date Faceting on this field. This should be a single date expression which may use the DateMathParser syntax. Can be specified on a per field basis.
facet.date.end
The minimum upper bound for the last date range for all Date Faceting on this field (see facet.date.hardend for an explanation of why the actual end value may be greater). This should be a single date expression which may use the DateMathParser s
facet.date.gap
The size of each date range expressed as an interval to be added to the lower bound using the DateMathParser syntax. Eg: facet.date.gap= field basis.
facet.date.hardend
A Boolean parameter instructing Solr what to do in the event that facet.date.gap does not divide evenly between facet.date.start and facet.date.end. If this is true, the last date range constraint will have an upper bound of facet.date.end; if f
facet.date.other
See Details.
facet.date.include
See Details.
facet.range
Indicates what field to create range facets for. Example: facet.range=price&facet.range=age
facet.range.start
The lower bound of the ranges. Can be specified on a per field basis. Example: f.price.facet.range.start=0.0&f.age.facet.range.start=10
facet.range.end
The upper bound of the ranges. Can be specified on a per field basis. Example: f.price.facet.range.end=1000.0&f.age.facet.range.end=99
facet.range.gap
The size of each range expressed as a value to be added to the lower bound. For date fields, this should be expressed using the DateMathParser syntax. (ie: facet.range.gap= specified on a per field basis. Example: f.price.facet.range.gap=100&f.ag
facet.range.hardend
A Boolean parameter instructing Solr what to do in the event that facet.range.gap does not divide evenly between facet.range.start and facet.range.end. If this is true, the last range constraint will have an upper bound of facet.range.end; if fa
facet.range.other
See Details.
facet.range.include
See Details.
start
Record to start at, default to beginning.
rows
Number of records to return.
key
API key, if needed.
url
URL endpoint
wt
(character) Data format to return. One of xml or json (default).
raw
(logical) If TRUE, raw json or xml is returned. If FALSE (default), parsed data is returned.
callopts
Call options passed on to httr::GET
...
Further args, usually per field arguments for faceting.
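
The interaction of facet.range.start, facet.range.end, facet.range.gap and facet.range.hardend is easier to see with concrete numbers. The following is a minimal sketch in base R that computes the bucket boundaries locally. range_buckets() is a hypothetical helper, not part of the solr package, and it assumes the usual Solr behaviour: with hardend false the last bucket keeps the full gap width and may extend past the end value, and with hardend true it is cut off at the end value.

```r
# Hypothetical helper (not part of the solr package): compute the numeric
# bucket boundaries that range faceting would produce for the given settings.
range_buckets <- function(start, end, gap, hardend = FALSE) {
  stopifnot(gap > 0, end > start)
  lower <- seq(from = start, to = end, by = gap)
  # Drop a lower bound that coincides with `end`; it would open an empty bucket.
  lower <- lower[lower < end]
  upper <- lower + gap
  if (hardend) {
    # hardend = TRUE: the last bucket stops exactly at `end`.
    upper <- pmin(upper, end)
  }
  data.frame(lower = lower, upper = upper)
}

# Settings from the range faceting example: (1000 - 5) / 10 is not a whole number.
soft <- range_buckets(5, 1000, 10, hardend = FALSE)
hard <- range_buckets(5, 1000, 10, hardend = TRUE)
tail(soft, 1)  # last bucket runs from 995 to 1005
tail(hard, 1)  # last bucket runs from 995 to 1000
```

Both settings yield the same number of buckets here; only the upper bound of the last one differs.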

Value

  • Raw json or xml, or a list of four parsed elements (usually data frames).

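Details

A number of fields can be specified multiple times, in which case you can separate them by commas, like facet.field='journal,subject'. Those fields are:
  • facet.field
  • facet.query
  • facet.date
  • facet.date.other
  • facet.date.include
  • facet.range
  • facet.range.other
  • facet.range.include

Options for some parameters:

facet.sort: This param determines the ordering of the facet field constraints.

  • count: sort the constraints by count (highest count first).
  • index: return the constraints sorted in their index order (lexicographic by indexed term). For terms in the ASCII range, this will be alphabetically sorted.

facet.method: This param selects the algorithm used when faceting a field.

  • enum: enumerate all terms in the field and intersect the documents matching each term with the documents matching the query.
  • fc: iterate over the documents that match the query and sum the terms that appear in each document (field cache).
  • fcs: per-segment field faceting for single-valued string fields.

facet.date.other and facet.range.other: These params request counts in addition to the counts for each range.

  • before: all records with field values lower than the lower bound of the first range.
  • after: all records with field values greater than the upper bound of the last range.
  • between: all records with field values between the start and end bounds of all ranges.
  • none: compute none of this information.
  • all: shortcut for before, between, and after.

facet.date.include and facet.range.include: These params control which range bounds are inclusive.

  • lower: all gap-based ranges include their lower bound.
  • upper: all gap-based ranges include their upper bound.
  • edge: the first and last gap ranges include their edge bounds (lower for the first one, upper for the last one) even if the corresponding lower or upper option is not specified.
  • outer: the "before" and "after" ranges are inclusive of their bounds, even if the first or last ranges already include those boundaries.
  • all: shorthand for lower, upper, edge, and outer.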
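
The comma-separated convention for multi-valued parameters can be pictured as one argument being split into several query parameters of the same name. This is only a sketch under that assumption: expand_multi() is a hypothetical helper, and the package's actual internals may differ.

```r
# Hypothetical helper (not the package's internal code): split a
# comma-separated argument into one query parameter per value.
expand_multi <- function(name, value) {
  parts <- trimws(strsplit(value, ",", fixed = TRUE)[[1]])
  parts <- parts[nzchar(parts)]
  # Repeat the parameter name once per value, e.g. two facet.field entries.
  stats::setNames(as.list(parts), rep(name, length(parts)))
}

params <- expand_multi("facet.field", "journal,subject")
str(params)

# Rendered as a query string fragment:
paste(names(params), unlist(params), sep = "=", collapse = "&")
# "facet.field=journal&facet.field=subject"
```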

References

See http://wiki.apache.org/solr/SimpleFacetParameters for more information on faceting.

See Also

solr_search, solr_highlight, solr_parse

Examples

url <- 'http://api.plos.org/search'
key <- getOption('PlosApiKey')

# Facet on a single field
solr_facet(q='*:*', facet.field='journal', url=url, key=key)

# Facet on multiple fields
solr_facet(q='alcohol', facet.field='journal,subject', url=url, key=key)

# Using mincount
solr_facet(q='alcohol', facet.field='journal', facet.mincount='500', url=url, key=key)

# Using facet.query to get counts
solr_facet(q='*:*', facet.field='journal', facet.query='cell,bird', url=url, key=key)

# Date faceting
solr_facet(q='*:*', url=url, facet.date='publication_date',
facet.date.start='NOW/DAY-5DAYS', facet.date.end='NOW', facet.date.gap='+1DAY', key=key)

# Range faceting
solr_facet(q='*:*', url=url, facet.range='counter_total_all',
facet.range.start=5, facet.range.end=1000, facet.range.gap=10, key=key)

# Range faceting with > 1 field, same settings
solr_facet(q='*:*', url=url, facet.range='counter_total_all,alm_twitterCount',
facet.range.start=5, facet.range.end=1000, facet.range.gap=10, key=key)

# Range faceting with > 1 field, different settings
solr_facet(q='*:*', url=url, facet.range='counter_total_all,alm_twitterCount',
f.counter_total_all.facet.range.start=5, f.counter_total_all.facet.range.end=1000,
f.counter_total_all.facet.range.gap=10, f.alm_twitterCount.facet.range.start=5,
f.alm_twitterCount.facet.range.end=1000, f.alm_twitterCount.facet.range.gap=10, key=key)

# Get raw json or xml
## json
solr_facet(q='*:*', facet.field='journal', url=url, key=key, raw=TRUE)
## xml
solr_facet(q='*:*', facet.field='journal', url=url, key=key, raw=TRUE, wt='xml')

# Get raw data back, and parse later, same as what goes on internally if
# raw=FALSE (Default)
out <- solr_facet(q='*:*', facet.field='journal', url=url, key=key, raw=TRUE)
solr_parse(out)
out <- solr_facet(q='*:*', facet.field='journal', url=url, key=key, raw=TRUE,
   wt='xml')
solr_parse(out)

# Using the USGS BISON API (http://bison.usgs.ornl.gov/services.html#solr)
## The occurrence endpoint
url="http://bisonapi.usgs.ornl.gov/solr/occurrences/select"
solr_facet(q='*:*', facet.field='year', url=url)
solr_facet(q='*:*', facet.field='state_code', url=url)
solr_facet(q='*:*', facet.field='basis_of_record', url=url)
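
When several fields need their own range settings, the per-field argument names (f.<field>.facet.range.<setting>) can be generated rather than typed out. The following is a minimal sketch, assuming the per-field arguments are passed through ... as in the "different settings" example. per_field_range() is a hypothetical helper, the settings for alm_twitterCount are illustrative values, and the final call is commented out because it needs network access and an API key.

```r
# Hypothetical helper (not part of the solr package): build per-field
# range arguments named f.<field>.facet.range.<setting>.
per_field_range <- function(field, start, end, gap) {
  settings <- list(start = start, end = end, gap = gap)
  names(settings) <- paste0("f.", field, ".facet.range.", names(settings))
  settings
}

range_args <- c(
  per_field_range("counter_total_all", start = 5, end = 1000, gap = 10),
  per_field_range("alm_twitterCount", start = 0, end = 100, gap = 5)
)
names(range_args)

# Assumed usage against the PLOS endpoint (url = 'http://api.plos.org/search'):
# do.call(solr_facet, c(
#   list(q = '*:*', url = 'http://api.plos.org/search', key = key,
#        facet.range = 'counter_total_all,alm_twitterCount'),
#   range_args))
```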
