words(x, ...)
sents(x, ...)
paras(x, ...)
tagged_words(x, ...)
tagged_sents(x, ...)
tagged_paras(x, ...)
chunked_sents(x, ...)
parsed_sents(x, ...)
parsed_paras(x, ...)
For words(), a character vector with the word tokens in the document.
For sents(), a list of character vectors with the word tokens in each sentence.
For paras(), a list of lists of character vectors with the word tokens in each sentence, grouped according to the paragraphs.
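
The nesting of these three return values can be pictured with plain base R lists. The sketch below is only an illustration under assumptions: the object doc_paras and its contents are made up, and no function from the package is called. It mimics the shape described for paras() and shows how the shapes described for sents() and words() relate to it.

```r
# Hypothetical stand-in for the shape described for paras():
# a list of paragraphs, each a list of sentences,
# each sentence a character vector of word tokens.
doc_paras <- list(
  list(c("A", "short", "sentence", "."),
       c("Another", "one", ".")),
  list(c("A", "second", "paragraph", "."))
)

# Dropping the paragraph level gives the shape described for sents():
# a list of character vectors, one per sentence.
doc_sents <- unlist(doc_paras, recursive = FALSE)

# Dropping the sentence level as well gives the shape described for
# words(): a single character vector of tokens.
doc_words <- unlist(doc_sents)

length(doc_paras)    # number of paragraphs
lengths(doc_sents)   # number of tokens per sentence
doc_words            # all tokens in document order
```

Flattening one level at a time with unlist(recursive = FALSE) keeps the sentence boundaries, which a single full unlist() would discard.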
For tagged_words(), a character vector with the POS tagged word tokens in the document (i.e., the word tokens and their POS tags, separated by /).
For tagged_sents(), a list of character vectors with the POS tagged word tokens in each sentence.
For tagged_paras(), a list of lists of character vectors with the POS tagged word tokens in each sentence, grouped according to the paragraphs.
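
Since tagged tokens combine a word and its POS tag with a / separator, it can be handy to split them apart again. The following is a minimal base R sketch, assuming a made-up vector tagged in that form; the names tagged, tokens, and tags are illustrative and not part of the package.

```r
# Hypothetical tagged tokens in the "word/TAG" form described above.
tagged <- c("The/DT", "cat/NN", "sat/VBD", "./.")

# Split on the last "/" so that word tokens which themselves
# contain a slash are kept intact.
tokens <- sub("/[^/]*$", "", tagged)  # drop the final "/TAG" part
tags   <- sub("^.*/", "", tagged)     # keep only what follows the last "/"

# Side-by-side view of tokens and their tags.
data.frame(token = tokens, tag = tags, stringsAsFactors = FALSE)
```

Anchoring both patterns on the last slash is a design choice for robustness; it assumes that the tags themselves never contain a /.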
For chunked_sents(), a list of (flat) Tree objects giving the chunk trees for each sentence in the document.
For parsed_sents(), a list of Tree objects giving the parse trees for each sentence in the document.
For parsed_paras(), a list of lists of Tree objects giving the parse trees for each sentence in the document, grouped according to the paragraphs in the document.
TextDocument for basic information on the text document infrastructure employed by package