
Conveniently extract features from annotations and annotated plain text documents.
features(x, type = NULL, simplify = TRUE)
x: an object inheriting from class "Annotation" or "AnnotatedPlainTextDocument".
type: a character vector of annotation types to be used for selecting annotations, or NULL (default) to use all annotations. When selecting, the elements of type are partially matched against the annotation types.
simplify: a logical indicating whether to simplify feature values to a vector.
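As an illustrative sketch of how type selects annotations, the following builds a small hand-made Annotation object and compares the default with an abbreviated type. The ids, spans, and feature values are invented for illustration and do not come from any real annotator.

```r
library(NLP)

## Hypothetical annotations: one sentence and two words.
a <- Annotation(id = 1:3,
                type = c("sentence", "word", "word"),
                start = c(1L, 1L, 7L),
                end = c(11L, 5L, 11L),
                features = list(list(kind = "greeting"),
                                list(POS = "UH"),
                                list(POS = "NN")))

## Default type = NULL: features from all three annotations.
all_f <- features(a)

## "w" is partially matched against the annotation types, selecting "word".
word_f <- features(a, "w")

nrow(all_f)
nrow(word_f)
```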
features() gathers all feature tag-value pairs in the selected annotations
into a data frame whose variables hold the values for each tag found
(using NULL for annotations that have no value for a tag). In general, the
variables are lists of extracted values. By default, a variable whose
elements are all length-one atomic vectors is simplified into an atomic
vector of values. The values for a specific tag can be extracted by
subscripting the resulting data frame.
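The simplification rule can be seen on a minimal, hand-made example. This is only a sketch: the two word annotations and their feature values are hypothetical, chosen so that one tag is present everywhere and the other is missing once.

```r
library(NLP)

## Hypothetical word annotations; the second one has no "lemma" feature.
a <- Annotation(id = 1:2,
                type = c("word", "word"),
                start = c(1L, 7L),
                end = c(5L, 11L),
                features = list(list(POS = "UH", lemma = "hello"),
                                list(POS = "NN")))

f <- features(a)

## Every "POS" value is a length-one atomic vector, so by default
## this variable is simplified to an atomic vector.
f$POS

## "lemma" is missing for the second annotation and is filled with NULL,
## so this variable stays a list.
f$lemma

## With simplify = FALSE, all variables are kept as lists.
f_raw <- features(a, simplify = FALSE)
is.list(f_raw$POS)
```

Keeping the list form with simplify = FALSE may be convenient when later code should handle every variable uniformly, regardless of whether some tags happen to be complete.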
# NOT RUN {
## Load the package that provides features() and words().
library("NLP")
## Use a pre-built annotated plain text document,
## see ? AnnotatedPlainTextDocument.
doc <- readRDS(system.file("texts", "stanford.rds", package = "NLP"))
## Extract features of all *word* annotations in doc:
x <- features(doc, "word")
## Could also have abbreviated "word" to "w".
x
## Only lemmas:
x$lemma
## Words together with lemmas:
paste(words(doc), x$lemma, sep = "/")
# }