- es_host
A string identifying an Elasticsearch host. This should be of the form
[transfer_protocol][hostname]:[port]. For example, 'http://myindex.thing.com:9200'.
- es_index
The name of an Elasticsearch index to be queried. Passing NULL is not
supported. Elasticsearch itself accepts a request with no index and
searches over all indexes, but that query is very expensive, so
uptasticsearch forbids it to keep it from being run by accident.
If you want to execute a query over all indexes in the cluster,
set this argument to "_all".
- size
Number of records per page of results.
See Elasticsearch docs for more.
Note that this will be reset to 0 if you submit a query_body with
an "aggs" request in it. Also see max_hits.
- query_body
String with a valid Elasticsearch query. Default is an empty query.
- scroll
How long should the scroll context be held open? This should be a
duration string like "1m" (for one minute) or "15s" (for 15 seconds).
The scroll context will be refreshed every time you ask Elasticsearch
for another page of results, so this parameter should just be the amount
of time you expect to pass between requests. See the
Elasticsearch scroll/pagination docs
for more information.
- max_hits
Integer. If specified, es_search will stop pulling data as soon
as it has pulled this many hits. Default is Inf, meaning that
all possible hits will be pulled.
- n_cores
Number of cores to distribute fetching and processing over.
- break_on_duplicates
Boolean, defaults to TRUE. es_search uses the size of the
final object it returns to check whether any data were lost
during processing. If you have duplicates in the source data, you
will have to set this flag to FALSE and just trust that no data have
been lost. Sorry :(
- ignore_scroll_restriction
There is a cost associated with keeping an
Elasticsearch scroll context open. By default,
this function rejects values of scroll
that exceed one hour. This is done to prevent
costly mistakes made by novice Elasticsearch users.
If you understand the cost of keeping the context
open for a long time and would like to pass a scroll
value longer than an hour, set ignore_scroll_restriction
to TRUE.
- intermediates_dir
When scrolling over search results, this function writes
intermediate results to disk. By default, `es_search` will create a temporary
directory in whatever working directory the function is called from. If you
want to change this behavior, provide a path here. `es_search` will create
and write to a temporary directory under whatever path you provide.
- verbose
TRUE if verbose logs should be printed. FALSE by default.
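
As a rough illustration of how these arguments fit together, the sketch below builds an argument list for es_search and applies a local version of the one-hour scroll guard described above. The index name "my_index", the size and max_hits values, and the helper scroll_to_seconds are assumptions made for this example. The helper is hypothetical, is not part of uptasticsearch, and handles only the s, m, h, and d units. The actual call is left commented out because it needs a reachable cluster.

```r
# Hypothetical helper (not part of uptasticsearch): convert a scroll
# duration string such as "1m" or "15s" to a number of seconds.
scroll_to_seconds <- function(scroll) {
    # Seconds per unit, for the subset of time units handled in this sketch.
    unit_seconds <- c(s = 1, m = 60, h = 3600, d = 86400)
    parts <- regmatches(scroll, regexec("^([0-9]+)([smhd])$", scroll))[[1]]
    if (length(parts) != 3) {
        stop("scroll should look like '15s', '1m', '2h' or '1d'")
    }
    as.numeric(parts[2]) * unit_seconds[[parts[3]]]
}

# Arguments for a capped, single-core pull. Host and index are placeholders.
search_args <- list(
    es_host = "http://myindex.thing.com:9200",
    es_index = "my_index",
    size = 1000,
    query_body = '{"query": {"match_all": {}}}',
    scroll = "1m",
    max_hits = 5000,
    n_cores = 1,
    break_on_duplicates = TRUE,
    ignore_scroll_restriction = FALSE,
    verbose = TRUE
)

# Mirror the documented one-hour guard before sending anything to the cluster.
if (scroll_to_seconds(search_args$scroll) > 3600 &&
        !search_args$ignore_scroll_restriction) {
    stop("scroll exceeds one hour; set ignore_scroll_restriction = TRUE to allow it")
}

# With uptasticsearch installed and a reachable cluster, the call would be:
# result <- do.call(uptasticsearch::es_search, search_args)
```

Keeping the arguments in a named list makes it easy to check them before any request is sent. To query every index in the cluster, es_index would be set to "_all" rather than left as NULL.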