polmineR (version 0.7.11)

cpos: Get corpus positions for a query or queries.

Description

Get matches for a query in a CQP corpus, optionally using the CQP syntax of the Corpus Workbench (CWB).

Usage

cpos(.Object, ...)

# S4 method for character
cpos(.Object, query, p_attribute = getOption("polmineR.p_attribute"), cqp = is.cqp, check = TRUE, encoding = NULL, verbose = TRUE, ...)

# S4 method for partition
cpos(.Object, query, cqp = is.cqp, check = TRUE, p_attribute = NULL, verbose = TRUE, ...)

# S4 method for tempcorpus
cpos(.Object, query, shift = TRUE)

# S4 method for matrix
cpos(.Object)

# S4 method for hits
cpos(.Object)

Arguments

.Object

A character vector indicating a CWB corpus, a partition object, a tempcorpus object, a matrix with corpus positions, or a hits object.

...

Used for reasons of backwards compatibility to process arguments that have been renamed (e.g. pAttribute).

query

A character vector providing one or multiple queries (token or CQP query).

p_attribute

The p-attribute to search. Needs to be stated only if query is not a CQP query. Defaults to getOption("polmineR.p_attribute") in the character method and to NULL in the partition method.

cqp

Either a logical value (TRUE if query is a CQP query), or a function that checks whether query is a CQP query (defaults to the auxiliary function is.cqp).

check

A logical value, whether to check the validity of the CQP query using check_cqp_query.

encoding

The encoding of the corpus (if NULL, the encoding stated in the registry file of the corpus will be used).

verbose

A logical value, whether to show messages.

shift

A logical value. If TRUE, the corpus positions resulting from the query performed on the tempcorpus are shifted so that they match the positions of the corpus from which the tempcorpus was generated.
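
The cqp argument accepts either a logical value or a checker function. The following base R sketch illustrates that calling convention only. It is not polmineR's implementation: looks_like_cqp and resolve_cqp are hypothetical names, and the heuristic (a query counts as CQP if it contains a double quote or an opening square bracket) is an assumption made for illustration, not the behaviour of is.cqp.

```r
# Hypothetical stand-in for an auxiliary checker; NOT polmineR's is.cqp.
# Assumption: a query is treated as CQP if it contains '"' or '['.
looks_like_cqp <- function(query) {
  grepl('"', query, fixed = TRUE) | grepl("[", query, fixed = TRUE)
}

# Resolve a 'cqp' argument that is either a logical value or a function.
resolve_cqp <- function(query, cqp = looks_like_cqp) {
  if (is.function(cqp)) cqp(query) else as.logical(cqp)
}

# Let the checker decide for each query.
resolved_auto <- resolve_cqp(c("oil", '"oil" "price"'))

# Override the checker with an explicit logical value.
resolved_forced <- resolve_cqp("oil", cqp = TRUE)
```

Passing a logical value skips the check entirely, which is useful when the caller already knows how the query should be interpreted.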

Value

Unless .Object is a matrix or a hits object, the return value is a matrix with two columns. The first column reports the left (starting) corpus positions (cpos) of the hits obtained; the second column reports the right (ending) corpus position of each hit. The number of rows is the number of hits. If there are no hits, NULL is returned.
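
Because the result is either a two-column matrix or NULL, downstream code has to handle both cases. The sketch below uses a hand-built matrix with made-up positions in place of a real cpos() result, so it runs without a CWB corpus; count_hits and match_lengths are hypothetical helper names, not part of polmineR.

```r
# Hand-built stand-in for a cpos() result; the positions are made up.
# Each row is one hit: column 1 = start position, column 2 = end position.
hits_matrix <- matrix(
  c(10L, 10L,
    25L, 27L,
    40L, 41L),
  ncol = 2, byrow = TRUE
)

# Number of hits, treating NULL (no hits) as zero.
count_hits <- function(m) if (is.null(m)) 0L else nrow(m)

# Number of tokens covered by each hit (end - start + 1).
match_lengths <- function(m) {
  if (is.null(m)) integer(0) else m[, 2] - m[, 1] + 1L
}

n_hits <- count_hits(hits_matrix)
n_none <- count_hits(NULL)
lens <- match_lengths(hits_matrix)
```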

Details

If the cpos-method is applied to a "character", "partition", or "tempcorpus" object, the result is a two-column matrix with the regions (start and end corpus positions of the matches) for a query. CQP syntax can be used. The encoding of the query is adjusted to conform to the encoding of the CWB corpus.

If the cpos-method is called on a matrix object, the cpos matrix is unfolded and the return value is an integer vector with the individual corpus positions. Likewise, if .Object is a hits object, an integer vector with the individual corpus positions is returned.
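
To make the unfolding step concrete, here is a minimal base R sketch of what expanding a region matrix into individual corpus positions can look like. It approximates the described behaviour under the assumption that each row is an inclusive start/end range; unfold_cpos is a hypothetical name, the positions are made up, and this is not the package's own code.

```r
# Hand-built region matrix with made-up positions (start, end per row).
region_matrix <- matrix(
  c(10L, 10L,
    25L, 27L,
    40L, 41L),
  ncol = 2, byrow = TRUE
)

# Expand every inclusive start:end range and concatenate the results.
unfold_cpos <- function(m) {
  unlist(lapply(seq_len(nrow(m)), function(i) m[i, 1]:m[i, 2]))
}

positions <- unfold_cpos(region_matrix)
```

The unfolded vector is convenient when individual token positions are needed, for example to look up token values position by position.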