cleanNLP (version 1.5.2)

get_dependency: Access dependencies from an annotation object

Description

This function extracts the table of dependencies from an annotation object. Dependencies are binary relationships between the tokens of a sentence. Common examples include the nominal subject (linking the subject of a clause to its verb) and adjectival modifiers (linking an adjective to the noun it modifies). Although words and lemmas are not stored in the underlying dependency data, the function has an option for linking these dependencies to the raw words and lemmas in the table of tokens. Both language-agnostic and language-specific universal dependency types are included in the output.

Usage

get_dependency(annotation, get_token = FALSE)

Arguments

annotation
an annotation object
get_token
logical. Should words and lemmas be attached to the returned dependency table?

Value

Returns an object of class c("tbl_df", "tbl", "data.frame") containing one row for every dependency pair in the corpus.

The returned data frame includes at a minimum the following columns:

  • "id" - integer. Id of the source document.
  • "sid" - integer. Sentence id of the source token.
  • "tid" - integer. Id of the source token.
  • "tid_target" - integer. Id of the target token.
  • "relation" - character. Language-agnostic universal dependency type.
  • "relation_full" - character. Language-specific universal dependency type.

If get_token is set to TRUE, the following columns will also be included:

  • "word" - character. The source word in the raw text.
  • "lemma" - character. Lemmatized form of the source word.
  • "word_target" - character. The target word in the raw text.
  • "lemma_target" - character. Lemmatized form of the target word.
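
The column layout above can be illustrated with a small hand-built example. This is only a sketch under stated assumptions: the `tokens` and `deps` tables below are hypothetical toy data for the sentence "The dog barks", not output from cleanNLP, and the two `merge` calls merely approximate what attaching words and lemmas amounts to conceptually. They are not the package's actual implementation.

```r
# Toy token table for "The dog barks" (hypothetical data, not from cleanNLP).
tokens <- data.frame(
  id = 1L, sid = 1L, tid = 1:3,
  word = c("The", "dog", "barks"),
  lemma = c("the", "dog", "bark"),
  stringsAsFactors = FALSE
)

# Toy dependency table with the documented minimum columns.
deps <- data.frame(
  id = 1L, sid = 1L,
  tid = c(3L, 2L),          # source tokens: "barks", "dog"
  tid_target = c(2L, 1L),   # target tokens: "dog", "The"
  relation = c("nsubj", "det"),
  relation_full = c("nsubj", "det"),
  stringsAsFactors = FALSE
)

# Attach the source word and lemma by matching on (id, sid, tid).
out <- merge(deps, tokens, by = c("id", "sid", "tid"))

# Attach the target word and lemma by matching tid_target to the token id.
target <- tokens
names(target) <- c("id", "sid", "tid_target", "word_target", "lemma_target")
out <- merge(out, target, by = c("id", "sid", "tid_target"))

# The subject of the clause is the target of the "nsubj" relation.
subject <- out$lemma_target[out$relation == "nsubj"]
```

In this toy table the source of the "nsubj" row is the verb and the target is the noun, which matches how the example in this page reads subjects from lemma_target.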

References

Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. http://nlp.stanford.edu/pubs/StanfordCoreNlp2014.pdf. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55-60.

Danqi Chen and Christopher D. Manning. 2014. A Fast and Accurate Dependency Parser using Neural Networks. In: Proceedings of EMNLP 2014.

Spence Green, Marie-Catherine de Marneffe, John Bauer, and Christopher D. Manning. 2011. Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French. In: EMNLP 2011.

Spence Green and Christopher D. Manning. 2010. Better Arabic Parsing: Baselines, Evaluations, and Analysis. In: COLING 2010.

Pi-Chuan Chang, Huihsin Tseng, Dan Jurafsky, and Christopher D. Manning. 2009. Discriminative Reordering with Chinese Grammatical Relations Features. In: Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation.

Anna Rafferty and Christopher D. Manning. 2008. Parsing Three German Treebanks: Lexicalized and Unlexicalized Baselines. In: ACL Workshop on Parsing German.

Examples

library(cleanNLP)
library(dplyr)   # provides %>% and filter()

data(obama)

# find the most common noun lemmas that are the syntactic subject of a clause
res <- get_dependency(obama, get_token = TRUE) %>%
  filter(relation == "nsubj")
res$lemma_target %>%
  table() %>%
  sort(decreasing = TRUE) %>%
  head(n = 40)