get_token

annotation

<p>This function returns the table of tokens from an annotation object. The
table has exactly one row for each token found in the raw text. Tokens include
words as well as punctuation marks. A token called <code>ROOT</code> is also
added to each sentence; it is particularly useful when working with
the table of dependencies.</p>

Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Users may make use of a Python back end with 'spaCy' (<https://spacy.io>) or the Java back end 'CoreNLP' (<http://stanfordnlp.github.io/CoreNLP/>). A minimal back end with no external dependencies is also provided. Exposed annotation tasks include tokenization, part-of-speech tagging, named entity recognition, entity linking, sentiment analysis, dependency parsing, coreference resolution, and word embeddings. Summary statistics on token unigram, part-of-speech tag, and dependency type frequencies are also included to assist with analyses.

Taylor Arnold

cleanNLP

get_token: Access tokens from an annotation object

Description

Usage

Arguments

Value

References

Examples
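
A minimal sketch of a typical call, not taken from the package's own examples. It assumes the installed cleanNLP version exports `get_token` as described on this page and bundles a pre-computed annotation object named `obama`; both are assumptions, and any annotation object you have created yourself can be passed instead.

```r
library(cleanNLP)

# Assumption: `obama` is a pre-computed annotation object shipped with the
# package; replace it with your own annotation object if it is unavailable.
data(obama)

# Extract the token table: one row per token, punctuation included.
tokens <- get_token(obama)

# Inspect the first few rows and count the tokens.
head(tokens)
nrow(tokens)
```

Because the result is an ordinary table, it can be filtered, grouped, or joined to the other annotation tables with the usual data frame tools.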