zipfR (version 0.6-66)

Tiger: Tiger NP and PP expansions (zipfR)

Description

Objects of classes tfl, spc and vgc that contain frequency data for the syntactic expansions of Noun Phrases (NP) and Prepositional Phrases (PP) in the Tiger German treebank.

Usage

TigerNP.tfl
TigerNP.spc
TigerNP.emp.vgc

TigerPP.tfl
TigerPP.spc
TigerPP.emp.vgc

Details

In this dataset, types are not words, but syntactic expansions, i.e., sequences of syntactic categories that form NPs (in TigerNP) or PPs (in TigerPP), according to the Tiger annotation scheme for German. Thus, for example, among the expansion types in the TigerNP dataset, we find ART_NN and ART_ADJA_NN, whereas among the PP expansions in TigerPP we find APPR_ART_NN and APPR_NN (APPR is the tag for prepositions in the Tiger tagset).
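As a rough illustration of what it means to treat expansions as types, the sketch below counts a short sequence of NP expansion tokens in base R. The token sequence is invented toy data, not taken from Tiger, and the variable names are assumptions chosen for this example.

```r
# Toy sequence of NP expansion tokens (hypothetical data, not from Tiger).
np.tokens <- c("ART_NN", "ART_ADJA_NN", "ART_NN", "NN", "ART_NN",
               "ART_ADJA_NN", "ADJA_NN", "ART_NN")

# Each distinct expansion string is one type; count its tokens.
type.freqs <- sort(table(np.tokens), decreasing = TRUE)

N.toy <- length(np.tokens)   # sample size: number of expansion tokens
V.toy <- length(type.freqs)  # vocabulary size: number of distinct expansions

print(type.freqs)
```

Each occurrence of an NP in the corpus contributes one token, and the category sequence it expands to determines its type.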

The Tiger treebank contains about 900,000 tokens (50,000 sentences) of German newspaper text from the Frankfurter Rundschau. The token frequencies of the expansion types are taken from this corpus.

TigerNP.tfl and TigerPP.tfl are the type frequency lists. TigerNP.spc and TigerPP.spc are frequency spectra. TigerNP.emp.vgc and TigerPP.emp.vgc are the corresponding observed vocabulary growth curves (tracking the development of V and V(1) in the original order of occurrence of the expansion tokens in the source corpus).
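To make the relationship between the three kinds of object concrete, here is a minimal base R sketch that derives a frequency spectrum and an observed growth curve from a toy token sequence. The data are invented for illustration, and the code does not reproduce how the zipfR objects themselves were built.

```r
# Toy sequence of PP expansion tokens in corpus order (hypothetical data).
tokens <- c("APPR_ART_NN", "APPR_NN", "APPR_ART_NN", "APPR_NE",
            "APPR_ART_NN", "APPR_NN", "APPR_ART_ADJA_NN")

# Type frequency list: one count per distinct expansion.
type.freqs <- table(tokens)

# Frequency spectrum: V(m) = number of types occurring exactly m times.
# Names of the result are the frequency classes m, values are V(m).
spectrum <- table(as.vector(type.freqs))

# Observed vocabulary growth: V and V(1) after each of the first n tokens.
n <- seq_along(tokens)
V.growth <- cumsum(!duplicated(tokens))
V1.growth <- sapply(n, function(i) sum(table(tokens[1:i]) == 1))

print(spectrum)
print(data.frame(N = n, V = V.growth, V1 = V1.growth))
```

Note that V never decreases as more tokens are read, whereas V(1) can fall whenever a type that had occurred once is seen a second time.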

References

Tiger Project: http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/tiger.html

Examples

# NOT RUN {
TigerNP.tfl
summary(TigerNP.spc)
summary(TigerNP.emp.vgc)

TigerPP.tfl
summary(TigerPP.spc)
summary(TigerPP.emp.vgc)

# }
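Beyond summary(), the objects can be queried with the zipfR accessor functions N(), V() and Vm(). The following is a small sketch, assuming the zipfR package is installed and its datasets are available once the package is attached. The variable names are chosen for this example only.

```r
library(zipfR)

# Sample size N and vocabulary size V of the NP expansion data.
n.tokens <- N(TigerNP.tfl)
n.types  <- V(TigerNP.tfl)

# Number of expansion types seen exactly once (hapax legomena), V(1).
n.hapax <- Vm(TigerNP.spc, 1)

# An observed growth curve stores one V value per sample size N.
growth.N <- N(TigerNP.emp.vgc)
growth.V <- V(TigerNP.emp.vgc)

cat("tokens:", n.tokens, "types:", n.types, "hapaxes:", n.hapax, "\n")
```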