Efron & Thisted used this data to ask the question, "How many words did
Shakespeare know?" Put another way, suppose a new corpus of works by
Shakespeare were discovered, also with 884,647 words. How many new word
types would appear? The answer to the main question involves contemplating
an infinite number of such new corpora.
In addition to the words that appear 1 to 100 times, there are 846 words
that appear more than 100 times; these are not listed in this data set.
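
To make the "new corpus" question concrete, here is a minimal sketch of the
classical Good-Toulmin alternating-series estimate, which takes the number
of word types n_x seen exactly x times and predicts how many new types a
further sample t times the size of the original would contain. It is an
illustration under stated assumptions, not a reproduction of the authors'
full analysis: the function name, the dictionary layout, and the toy counts
are hypothetical and are not taken from this data set.

```python
def expected_new_types(freq_of_freqs, t=1.0):
    """Good-Toulmin estimate of new word types in a further sample.

    freq_of_freqs maps a frequency x to n_x, the number of word types
    observed exactly x times. t is the size of the new sample relative
    to the original one (t = 1 means a corpus of equal size).
    """
    # Alternating series: sum over x of (-1)^(x+1) * t^x * n_x
    return sum(
        (-1) ** (x + 1) * (t ** x) * n_x
        for x, n_x in freq_of_freqs.items()
    )


# Toy counts, invented for illustration only (not the Shakespeare data).
toy_counts = {1: 10, 2: 4, 3: 1}
print(expected_new_types(toy_counts))  # 10 - 4 + 1 = 7.0
```

For t = 1 the estimate reduces to n_1 - n_2 + n_3 - ..., so it depends most
heavily on the rarest words. The series becomes unstable for t well above 1,
which is one reason the "infinite number of new corpora" question needs more
care than this sketch provides.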