In a CWB corpus, every token has positional attributes (p-attributes). Whereas
structural attributes (s-attributes) cover ranges of tokens, every single token
in the token stream of a corpus has a set of p-attributes (such as
part-of-speech or lemma). The available p-attributes are returned by the
p_attributes method.
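The distinction between the two attribute types can be pictured with a toy data structure in base R. This is only an illustrative sketch, not how CWB stores data on disk: the data frames `tokens` and `sentences` and all of their values are invented for the example. P-attributes appear as columns with one value per corpus position, while an s-attribute appears as a region delimited by a left and a right corpus position.

```r
# Toy token stream (invented data): one row per corpus position (cpos),
# one column per p-attribute.
tokens <- data.frame(
  cpos  = 0:5,
  word  = c("The", "markets", "fell", ".", "Prices", "rose"),
  pos   = c("DT", "NNS", "VBD", ".", "NNS", "VBD"),
  lemma = c("the", "market", "fall", ".", "price", "rise"),
  stringsAsFactors = FALSE
)

# Toy s-attribute (invented data): each row is a region of tokens,
# given by its left and right corpus positions.
sentences <- data.frame(
  cpos_left  = c(0L, 4L),
  cpos_right = c(3L, 5L),
  s_id       = c("s1", "s2"),
  stringsAsFactors = FALSE
)

# Every token carries a value for every p-attribute ...
p_attr_names <- setdiff(names(tokens), "cpos")
complete <- all(!is.na(tokens[p_attr_names]))

# ... whereas an s-attribute value applies to a whole range of tokens.
region_sizes <- sentences$cpos_right - sentences$cpos_left + 1L
```

In this picture, asking for the available p-attributes corresponds to reading off the column names in `p_attr_names`.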
p_attributes(.Object, ...)
# S4 method for character
p_attributes(.Object, p_attribute = NULL)
# S4 method for corpus
p_attributes(.Object, p_attribute = NULL)
# S4 method for slice
p_attributes(.Object, p_attribute = NULL, decode = TRUE)
# S4 method for partition_bundle
p_attributes(.Object, p_attribute = NULL, decode = TRUE)
# S4 method for remote_corpus
p_attributes(.Object, ...)
# S4 method for remote_partition
p_attributes(.Object, ...)
.Object: A length-one character vector, or a partition object.
...: Arguments passed to get_token_stream.
p_attribute: A p-attribute to decode, provided by a length-one character vector.
decode: A length-one logical value. Whether to return decoded p-attributes or unique token ids.
The p_attributes method returns the p-attributes defined for the
corpus the partition is derived from, if argument p_attribute is NULL
(the default). If p_attribute is defined, the unique
values for the p-attribute are returned.
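The three kinds of return value can be compared side by side in a short sketch. It assumes that polmineR is installed and that `use("polmineR")` makes the GERMAPARLMINI sample corpus available. The variable names `attr_names`, `words` and `ids` are chosen for illustration only.

```r
# Guarded so the sketch is skipped when polmineR is not installed.
if (requireNamespace("polmineR", quietly = TRUE)) {
  library(polmineR)
  use("polmineR")  # assumption: activates the GERMAPARLMINI sample corpus

  merkel <- partition("GERMAPARLMINI", speaker = "Merkel", regex = TRUE)

  # p_attribute = NULL (the default): names of the p-attributes of the corpus
  attr_names <- p_attributes(merkel)

  # p_attribute given: unique decoded values found in the partition
  words <- p_attributes(merkel, p_attribute = "word")

  # decode = FALSE: unique token ids instead of decoded values
  ids <- p_attributes(merkel, p_attribute = "word", decode = FALSE)
}
```

Working with undecoded ids may be preferable when the values are only needed for counting or matching.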
Stefan Evert & The OCWB Development Team, CQP Query Language Tutorial, https://cwb.sourceforge.io/files/CQP_Tutorial.pdf.
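library(polmineR)

# Activate the REUTERS sample corpus shipped with RcppCWB
use(pkg = "RcppCWB", corpus = "REUTERS")
p_attributes("REUTERS")
p_attributes("REUTERS", p_attribute = "word")

# GERMAPARLMINI is assumed to be the sample corpus shipped with polmineR
use("polmineR")
merkel <- partition("GERMAPARLMINI", speaker = "Merkel", regex = TRUE)
merkel_words <- p_attributes(merkel, "word")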