vocabularyDlg: Vocabulary Summary
Description
Build a vocabulary summary table over the documents or a meta-data variable of a corpus.
- Document
- Level
- level
- document
- Corpus mean
- Corpus total
Details
This dialog creates tables reporting several vocabulary measures
for each document of a corpus, or for each category of a corpus variable:
- number and percent of unique words, i.e. words appearing at least once
- number and percent of hapax legomena, i.e. terms appearing once and only once
- total number of words
- number and percent of long words (long being defined as at
least 7 characters)
- number and percent of very long words (very long being defined as
at least 10 characters)
- average word length
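
The measures listed above can be illustrated with a short, language-neutral sketch in Python. This is not the code behind the dialog: the function name `vocabulary_summary`, its parameters and the keys of the returned dictionary are hypothetical. It also assumes that percentages are taken relative to the total number of words, that long and very long words are counted per occurrence, and that the average length is computed over all occurrences; the dialog itself may use different conventions.

```python
from collections import Counter


def vocabulary_summary(words, long_min=7, very_long_min=10):
    """Compute vocabulary measures for one document given as a list of words.

    Assumptions (not taken from the package): percentages are relative to
    the total number of words, and long / very long words are counted per
    occurrence rather than per distinct word.
    """
    total = len(words)
    counts = Counter(words)

    unique = len(counts)                                   # words appearing at least once
    hapax = sum(1 for c in counts.values() if c == 1)      # words appearing exactly once
    long_words = sum(1 for w in words if len(w) >= long_min)
    very_long_words = sum(1 for w in words if len(w) >= very_long_min)

    def pct(n):
        # Guard against empty documents to avoid dividing by zero.
        return 100.0 * n / total if total else 0.0

    return {
        "total_words": total,
        "unique_words": unique,
        "pct_unique_words": pct(unique),
        "hapax_legomena": hapax,
        "pct_hapax_legomena": pct(hapax),
        "long_words": long_words,
        "pct_long_words": pct(long_words),
        "very_long_words": very_long_words,
        "pct_very_long_words": pct(very_long_words),
        "avg_word_length": sum(len(w) for w in words) / total if total else 0.0,
    }


# One toy document: 7 words, 6 distinct, 5 of them occurring once.
summary = vocabulary_summary("the cat sat on the extraordinary mat".split())
print(summary["total_words"], summary["unique_words"], summary["hapax_legomena"])
```

Applying the same function to the words of each document, or to the pooled words of each category of a variable, gives one row of measures per document or per category.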