kRp.POS.tags(lang = get.kRp.env(lang = TRUE), list.classes = FALSE, list.tags = FALSE, tags = c("words", "punct", "sentc"))
list.classes: if TRUE, only the known word classes for the chosen language will be returned.
list.tags: if TRUE, only the POS tags for the chosen language will be returned.
If list.classes=FALSE and list.tags=FALSE, returns a matrix with word tag definitions of the given language.
The matrix has three columns:
tag, class, and desc.
If both list.classes and list.tags are TRUE, still only the POS tags will be returned.
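The three return modes described above can be sketched as follows. This is a minimal illustration, not an excerpt from the package: it assumes koRpus is installed, uses the lang="kRp" tag set of the internal tokenizer so that no additional language support is needed, and the variable names are made up for the example.

```r
library(koRpus)

# Default: a matrix of tag definitions (columns tag, class, desc)
tag.defs <- kRp.POS.tags(lang = "kRp")
colnames(tag.defs)
head(tag.defs)

# Only the known word classes for the chosen language
word.classes <- kRp.POS.tags(lang = "kRp", list.classes = TRUE)

# Only the POS tags for the chosen language
pos.tags <- kRp.POS.tags(lang = "kRp", list.tags = TRUE)

# With both switches set, the text above says the POS tags are returned
both <- kRp.POS.tags(lang = "kRp", list.classes = TRUE, list.tags = TRUE)
```

Replacing "kRp" with a language code such as "en" should work the same way, provided the tag definitions for that language are available in the installed version.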
"de" --- German, according to the STTS guidelines (Schiller, Teufel,
Stockert, & Thielen, 1995)
"en" --- English, according to the Penn Treebank guidelines (Santorini,
1991)
"es" --- Spanish,
according to http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/spanish-tagset.txt
"fr" --- French,
according to http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/french-tagset.html
"it" --- Italian,
according to http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/italian-tagset.txt
and http://sslmit.unibo.it/~baroni/collocazioni/itwac.tagset.txt, respectively
"ru" --- Russian, according to the MSD tagset by Sharoff, Kopotev, Erjavec,
Feldman & Divjak (2008)
For the internal tokenizer a small subset of tags is also defined,
available through lang="kRp". If you don't know the language your text was written in,
the function guess.lang
should be able to detect it.
With the tags argument you can request either all tag definitions or only a subset,
e.g. only the tags for punctuation and sentence endings. Note that
you need to request both "punct" and "sentc" to get all punctuation tags.
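A short sketch of subsetting by tag type, again assuming koRpus is installed and using the lang="kRp" tag set; the variable names are illustrative only.

```r
library(koRpus)

# All tag definitions (the default for the tags argument)
all.tags <- kRp.POS.tags(lang = "kRp")

# Word tags only
word.tags <- kRp.POS.tags(lang = "kRp", tags = "words")

# All punctuation tags: both "punct" and "sentc" must be requested
punct.tags <- kRp.POS.tags(lang = "kRp", tags = c("punct", "sentc"))
```

Requesting "punct" alone would leave out the sentence-ending tags, which is why the two values are combined here.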
The function is not primarily intended to be used directly; it is called internally by several other functions. However, it can still be useful for examining the available POS tags.
Schiller, A., Teufel, S., Stockert, C. & Thielen, C. (1995). Vorläufige Guidelines für das Tagging deutscher Textcorpora mit STTS. URL: http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/stts_guide.pdf
Sharoff, S., Kopotev, M., Erjavec, T., Feldman, A. & Divjak, D. (2008). Designing and evaluating Russian tagsets. In: Proc. LREC 2008, Marrakech. URL: http://corpus.leeds.ac.uk/mocky/
get.kRp.env