get_document

Access document metadata from an annotation object

Usage
get_document(annotation)
Arguments
annotation

an annotation object

Value

Returns an object of class c("tbl_df", "tbl", "data.frame") containing one row for every document in the corpus.

The returned data frame includes at least the following columns:

  • "id" - integer. Id of the source document.

  • "time" - date time. The time at which the parser was run on the text.

  • "version" - character. Version of the CoreNLP library used to parse the text.

  • "language" - character. Language of the text, in ISO 639-1 format.

  • "uri" - character. Description of the raw text location. Set to NA if parsed from in-memory character vector.

Other application specific columns may be included as additional variables.
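
As a rough illustration of working with a table shaped like the one described above, the sketch below builds a small stand-in data frame by hand rather than calling the parser. The object name doc and every value in it are hypothetical; only the column names and types follow the Value description, and a real result would also carry the "tbl_df" and "tbl" classes.

```r
# Hypothetical stand-in for the result of get_document(annotation).
# All values are made up for illustration; only the column names and
# types follow the documented return value.
doc <- data.frame(
  id       = 0:2,
  time     = as.POSIXct(c("2017-01-01 10:00:00",
                          "2017-01-01 10:00:05",
                          "2017-01-01 10:00:09"), tz = "UTC"),
  version  = "3.7.0",
  language = c("en", "en", "fr"),
  uri      = c("speech1.txt", NA, "speech3.txt"),
  stringsAsFactors = FALSE
)

# One row per document, so nrow() gives the corpus size
n_docs <- nrow(doc)

# Documents parsed from an in-memory character vector have uri set to NA
in_memory_ids <- doc$id[is.na(doc$uri)]

# Keep only the documents whose ISO 639-1 language code is English
english <- doc[doc$language == "en", ]
```

The same subsetting should apply to an actual get_document() result, since it is also a data frame with one row per document.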

References

Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55-60.

Aliases
  • get_document
Examples
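# NOT RUN {
library(cleanNLP)

# Load the example annotation object shipped with the package
data(obama)

# Extract the document-level metadata, one row per document
get_document(obama)
# }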
Documentation reproduced from package cleanNLP, version 1.10.0, License: LGPL-2