RKorAPClient (version 1.1.0)

corpusQuery,KorAPConnection-method: Search corpus for query terms

Description

corpusQuery performs a corpus query via a connection to a KorAP API server.

Usage

# S4 method for KorAPConnection
corpusQuery(
  kco,
  query = if (missing(KorAPUrl)) {
    stop("At least one of the parameters query and KorAPUrl must be specified.",
      call. = FALSE)
  } else {
    httr2::url_parse(KorAPUrl)$query$q
  },
  vc = if (missing(KorAPUrl)) "" else httr2::url_parse(KorAPUrl)$query$cq,
  KorAPUrl,
  metadataOnly = TRUE,
  ql = if (missing(KorAPUrl)) "poliqarp" else httr2::url_parse(KorAPUrl)$query$ql,
  fields = c("corpusSigle", "textSigle", "pubDate", "pubPlace", "availability",
    "textClass", "snippet", "tokens"),
  accessRewriteFatal = TRUE,
  verbose = kco@verbose,
  expand = length(vc) != length(query),
  as.df = FALSE,
  context = NULL
)

Value

Depending on the as.df parameter, a tibble or a KorAPQuery() object that, among other information, contains the total number of results in @totalResults. The resulting object can be used to fetch all query results (with fetchAll()) or the next page of results (with fetchNext()). A corresponding URL to be used within a web browser is contained in @webUIRequestUrl.

Please make sure to check $collection$rewrites to see if any unforeseen access rewrites of the query's virtual corpus had to be performed.
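
As a minimal sketch of the workflow described above, the following wraps a query in a small helper and collects the slots mentioned in this section. The helper name summarize_query and the query term are illustrative assumptions, not part of the package, and running it requires network access to a KorAP server.

```r
# summarize_query() is a hypothetical helper, not part of RKorAPClient.
summarize_query <- function(query, vc = "") {
  # Connect to the default KorAP server
  kco <- RKorAPClient::KorAPConnection()
  # metadataOnly is TRUE by default, so no snippets are requested
  q <- RKorAPClient::corpusQuery(kco, query, vc = vc)
  list(
    total = q@totalResults,   # total number of results
    url = q@webUIRequestUrl,  # same query as a web browser URL
    query = q                 # pass on to fetchNext() or fetchAll()
  )
}

# Needs network access:
# res <- summarize_query("Ameisenplage")
# res$total
# matches <- RKorAPClient::fetchNext(res$query)@collectedMatches
```

Returning the query object alongside the summary keeps the option open to fetch matches later without repeating the request.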

Arguments

kco

KorAPConnection() object (obtained e.g. from KorAPConnection())

query

string that contains the corpus query. The query language depends on the ql parameter. Either query or KorAPUrl must be provided.

vc

string describing the virtual corpus in which the query should be performed. An empty string (default) means the whole corpus, as far as it is license-wise accessible.

KorAPUrl

instead of providing the query and vc string parameters, you can also simply copy a KorAP query URL from your browser and use it here (and in KorAPConnection) to provide all necessary information for the query.
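
The defaults shown under Usage derive query, vc and ql from the URL's query string via httr2::url_parse(). The sketch below illustrates that step in isolation; the URL and its parameter values are made up for illustration, and no request is sent.

```r
library(httr2)

# Illustrative URL; the parameter values are assumptions for this sketch.
KorAPUrl <- paste0(
  "https://korap.ids-mannheim.de/",
  "?q=Ameisenplage&cq=pubDate%20in%202010&ql=poliqarp"
)

# url_parse() returns the query string as a named list
params <- url_parse(KorAPUrl)$query

query <- params$q   # corpus query
vc <- params$cq     # virtual corpus definition
ql <- params$ql     # query language
```

With these values in place, passing KorAPUrl to corpusQuery() should be equivalent to passing query, vc and ql explicitly.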

metadataOnly

logical that determines whether queries should return only metadata without any snippets. This can also be useful to prevent access rewrites. Note that the default value is TRUE. If you want your corpus queries to return not only metadata but also KWICs, you need to authorize your RKorAPClient application as explained in the authorization section of the RKorAPClient Readme on GitHub and set the metadataOnly parameter to FALSE.

ql

string to choose the query language (see section on Query Parameters in the Kustvakt-Wiki for possible values).

fields

character vector specifying which metadata fields to retrieve for each match. Available fields depend on the corpus. For DeReKo (German Reference Corpus), possible fields include:

Text identification:

textSigle, docSigle, corpusSigle - hierarchical text identifiers

Publication info:

author, editor, title, docTitle, corpusTitle - authorship and titles

Temporal data:

pubDate, creationDate - when text was published/created

Publication details:

pubPlace, publisher, reference - where/how published

Text classification:

textClass, textType, textTypeArt, textDomain, textColumn - topic domain, genre, text type and column

Administrative and technical info:

corpusEditor, availability, language, foundries - access rights and annotations

Content data:

snippet, tokens, tokenSource, externalLink - actual text content, tokenization, and link to source text

System data:

indexCreationDate, indexLastModified - corpus indexing info

Use c("textSigle", "pubDate", "author") to retrieve multiple fields. Default fields provide basic text identification and publication metadata. The actual text content fields (snippet and tokens) are activated by default if metadataOnly is set to FALSE.
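
A minimal sketch of restricting the retrieved metadata with fields, assuming a reachable KorAP server and a corpus that offers these fields. The helper name fetch_metadata and its default field selection are illustrative assumptions, not part of the package.

```r
# fetch_metadata() is a hypothetical helper, not part of RKorAPClient.
fetch_metadata <- function(query,
                           fields = c("textSigle", "pubDate", "author")) {
  kco <- RKorAPClient::KorAPConnection()
  # Ask only for the listed metadata fields
  q <- RKorAPClient::corpusQuery(kco, query, fields = fields)
  # Retrieve the first page of matches
  q <- RKorAPClient::fetchNext(q)
  # Collected matches, one row per match
  q@collectedMatches
}

# Needs network access:
# head(fetch_metadata("Ameisenplage"))
```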

accessRewriteFatal

abort if query or given vc had to be rewritten due to insufficient rights (not yet implemented).

verbose

print some info

expand

logical that decides if query and vc parameters are expanded to all of their combinations. Defaults to TRUE if and only if query and vc have different lengths.
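
To make "all of their combinations" concrete, here is a base R sketch using expand.grid(). It only illustrates the pairing of query and vc values and the default rule for expand; it is not the package's internal code, and the query terms and virtual corpora are made-up examples.

```r
# Made-up example inputs of different lengths
query <- c("Ameisenplage", "Wespenplage")
vc <- c("pubDate in 2010", "pubDate in 2011", "pubDate in 2012")

# Default rule from the usage section: expand when the lengths differ
expand <- length(vc) != length(query)

grid <- if (expand) {
  # Every query paired with every virtual corpus
  expand.grid(query = query, vc = vc, stringsAsFactors = FALSE)
} else {
  # Element-wise pairing when the lengths match
  data.frame(query = query, vc = vc, stringsAsFactors = FALSE)
}

nrow(grid)  # 2 queries x 3 virtual corpora = 6 combinations
```

When query and vc have equal lengths, the default is element-wise pairing; set expand = TRUE explicitly to get the full cross product in that case.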

as.df

return result as data frame instead of as S4 object?

context

string that specifies the size of the left and the right context returned in snippet (provided that metadataOnly is set to FALSE and that the necessary access rights are granted). The format of the context size specification (e.g. 3-token,3-token) is described in the Service: Search GET documentation of the Kustvakt Wiki. If the parameter is not set, the default context size specification of the KorAP server instance will be used. Note that you cannot overrule the maximum context size set in the KorAP server instance, as this is typically legally motivated.

References

https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/9026

See Also

KorAPConnection(), fetchNext(), fetchRest(), fetchAll(), corpusStats()

Other corpus search functions: fetchAll,KorAPQuery-method, fetchNext,KorAPQuery-method