Access document metadata from an annotation object
get_document(annotation)
an annotation object
Returns an object of class c("tbl_df", "tbl", "data.frame")
containing one row for every document in the corpus.
The returned data frame includes at least the following columns:
"id" - integer. Id of the source document.
"time" - date-time. The time at which the parser was run on the text.
"version" - character. Version of the CoreNLP library used to parse the text.
"language" - character. Language of the text, in ISO 639-1 format.
"uri" - character. Description of the raw text location. Set to NA if parsed from an in-memory character vector.
Other application-specific columns may be included as additional variables.
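As a rough illustration of the shape described above, the sketch below builds a stand-in data frame by hand in base R and picks out the documents that were parsed from an in-memory character vector. It does not call the package: the object name doc and every value in it are made up for illustration, and storing "time" as POSIXct is an assumption about how the date-time column is represented.

```r
# Hypothetical stand-in for the documented return value; all values are
# invented for illustration and are not output from get_document().
doc <- data.frame(
  id = 0:1,                                           # integer document ids
  time = as.POSIXct(c("2017-01-01 12:00:00",          # assumed POSIXct storage
                      "2017-01-01 12:00:05"), tz = "UTC"),
  version = c("3.7.0", "3.7.0"),                      # illustrative version string
  language = c("en", "en"),                           # ISO 639-1 codes
  uri = c(NA_character_, "speeches/2009.txt"),        # NA: parsed from memory
  stringsAsFactors = FALSE
)

# Documents parsed from an in-memory character vector have a missing uri
in_memory <- doc[is.na(doc$uri), ]
print(in_memory$id)
```

The same is.na() filter should work on a real result as long as it carries the "uri" column listed above.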
Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55-60.
# NOT RUN {
data(obama)
get_document(obama)
# }