Brown.tfl, Brown.spc and Brown.emp.vgc are zipfR objects of classes tfl, spc and vgc, respectively.
These data were extracted from the Brown corpus (see Kucera and Francis 1967).
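As a rough illustration of how a type frequency list relates to a frequency spectrum, the sketch below builds both from a small made-up token vector using base R only. The token vector and the variable names (`tokens`, `type_freqs`, `spectrum`) are hypothetical, and the code does not reproduce the internal structure of zipfR's tfl or spc objects.

```r
# Toy token sequence (hypothetical data, not taken from the Brown corpus)
tokens <- c("the", "cat", "sat", "on", "the", "mat", "the", "end")

# Type frequency list: one frequency value per distinct type
type_freqs <- sort(table(tokens), decreasing = TRUE)

# Frequency spectrum: for each frequency class m, the number of types
# that occur exactly m times
spectrum <- table(as.vector(type_freqs))

print(type_freqs)
print(spectrum)

# Sanity check: summing m * V_m over the spectrum recovers the sample size N
N <- sum(as.integer(names(spectrum)) * spectrum)
```

The spectrum discards the identity of the types and keeps only the counts per frequency class, which is why it is much more compact than the type frequency list.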
Brown.emp.vgc is the empirical vocabulary growth curve, reflecting the V and V(1) development in the non-randomized corpus.
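To make the notion of V and V(1) development concrete, here is a minimal base R sketch that tracks the number of distinct types (V) and the number of types seen exactly once (V1) as a toy text is read token by token in its original order. The data and the names `tokens` and `growth` are assumptions made for illustration; this is not how Brown.emp.vgc itself was computed.

```r
# Toy token sequence in its original (non-randomized) order; hypothetical data
tokens <- c("the", "cat", "sat", "on", "the", "mat", "the", "end")

# For each prefix of length N, count the distinct types (V) and the
# types that occur exactly once so far (V1, the hapax legomena)
growth <- t(sapply(seq_along(tokens), function(n) {
  freqs <- table(tokens[seq_len(n)])
  c(N = n, V = length(freqs), V1 = sum(freqs == 1))
}))

print(growth)
```

V can only stay flat or rise as N grows, whereas V1 drops whenever a former hapax is repeated, as happens at the fifth token in this toy sequence.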
We removed numbers and other forms of non-linguistic material before collecting word counts from the Brown corpus.
Kucera, H. and Francis, W.N. (1967). Computational analysis of present-day American English. Brown University Press, Providence.
The datasets documented in BrownSubsets pertain to various subsets of the Brown corpus (e.g., informative prose, adjectives only).
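# NOT RUN {
library(zipfR)  # provides the Brown datasets and their summary methods

# type frequency list
data(Brown.tfl)
summary(Brown.tfl)

# frequency spectrum
data(Brown.spc)
summary(Brown.spc)

# empirical vocabulary growth curve
data(Brown.emp.vgc)
summary(Brown.emp.vgc)
# }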