polmineR (version 0.7.4)

cpos: Get corpus positions for a query or queries.

Description

Get matches for a query in a CQP corpus, optionally using the CQP syntax of the Corpus Workbench (CWB).

Usage

cpos(.Object, ...)

# S4 method for character
cpos(.Object, query, pAttribute = getOption("polmineR.pAttribute"), cqp = is.cqp, encoding = NULL, verbose = TRUE, ...)

# S4 method for partition
cpos(.Object, query, cqp = is.cqp, pAttribute = NULL, verbose = TRUE, ...)

# S4 method for tempcorpus
cpos(.Object, query, shift = TRUE)

# S4 method for matrix
cpos(.Object)

Arguments

.Object

a "character" vector indicating a CWB corpus, a "partition" object, a "tempcorpus" object, or a "matrix" with corpus positions

...

further arguments

query

a character vector providing one or multiple queries (token or CQP query)

pAttribute

the p-attribute to search. Needs to be stated only if query is not a CQP query. Defaults to getOption("polmineR.pAttribute") in the "character" method and to NULL in the "partition" method.

cqp

either a logical value (TRUE if query is a CQP query), or a function that checks whether query is a CQP query (defaults to the is.cqp auxiliary function)

encoding

the encoding of the corpus (if NULL, the encoding provided in the registry file of the corpus will be used)

verbose

logical, whether to be talkative

shift

logical, if TRUE, the corpus positions resulting from the query on the tempcorpus are shifted so that they match the positions in the corpus from which the tempcorpus was generated

Value

Unless .Object is a "matrix", the return value is a matrix with two columns: the first column gives the start cpos of each hit, the second column gives the end cpos of the same hit. The number of rows equals the number of hits. If there are no hits, NULL is returned.
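
Because the return value is either a two-column matrix or NULL, code that consumes it usually needs to check for NULL first. The following base R sketch illustrates this on a hand-made matrix in the shape described above. The values and the helper count_hits() are hypothetical and not part of polmineR.

```r
# Hypothetical result in the documented shape: one row per hit,
# column 1 = start cpos, column 2 = end cpos (values are made up).
hits <- matrix(c(10L, 10L,
                 25L, 27L,
                 40L, 41L), ncol = 2, byrow = TRUE)

# Hypothetical helper: number of hits, treating NULL (no hits) as zero.
count_hits <- function(x) if (is.null(x)) 0L else nrow(x)

n_hits <- count_hits(hits)   # matrix with three rows
n_none <- count_hits(NULL)   # what a query without hits would yield

# Number of tokens covered by each hit (start and end are inclusive).
hit_lengths <- hits[, 2] - hits[, 1] + 1L
```

Single-token hits have identical start and end positions, so their length is 1.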

Details

If the cpos-method is applied to a "character", "partition", or "tempcorpus" object, the result is a two-column matrix with the regions (start and end corpus positions of the matches) for a query. CQP syntax can be used. The encoding of the query is adjusted to conform to the encoding of the CWB corpus.

If the cpos-method is called on a "matrix" object, the cpos matrix is unfolded, i.e. the regions are expanded into the individual corpus positions they cover.
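
One way to picture the unfolding of a cpos matrix is to expand every row into the full sequence of positions from its start to its end. The base R sketch below does this with a hypothetical helper, unfold_cpos(). It is an illustration under that assumption, not the package's implementation, and the example values are made up.

```r
# Hypothetical helper: expand each region (row) of a two-column cpos
# matrix into all corpus positions from start to end, inclusive.
unfold_cpos <- function(m) {
  unlist(lapply(seq_len(nrow(m)), function(i) m[i, 1]:m[i, 2]))
}

# Two made-up regions: a single-token hit and a three-token hit.
regions <- matrix(c(10L, 10L,
                    25L, 27L), ncol = 2, byrow = TRUE)

positions <- unfold_cpos(regions)
```

A vector of individual positions is a convenient form when token-level annotations need to be looked up for every position inside a match.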