RcmdrPlugin.temis (version 0.6.1)

termFreqDlg: Term frequencies in the corpus

Description

Study frequencies of chosen terms in the corpus, among documents, or among levels of a variable.

Arguments

Details

This dialog creates a table reporting the frequency of chosen terms among documents or levels of a variable. If "None (whole corpus)" is selected, the table gives the absolute frequency of the chosen terms and their percentage of all term occurrences in the corpus. If "Document" or a variable is chosen, the table details the association of each term with documents or levels, including the % Term/Level and Global % columns, a t value, and a probability.
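The whole-corpus figures can be sketched in base R as follows. This is only an illustration under stated assumptions: the toy matrix `dtm` (documents in rows, terms in columns), its values, and all variable names are hypothetical stand-ins for a real document-term matrix, and the dialog itself may compute these figures differently.

```r
# Toy document-term matrix: 3 documents (rows) x 4 terms (columns).
# All names and values are made up for illustration.
dtm <- matrix(c(2, 0, 1, 3,
                0, 1, 4, 2,
                1, 1, 0, 5),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("doc1", "doc2", "doc3"),
                              c("economy", "health", "school", "tax")))

chosen <- c("economy", "tax")

# Absolute frequency of each chosen term over the whole corpus
abs_freq <- colSums(dtm[, chosen, drop = FALSE])

# Percentage of all term occurrences in the corpus
pct <- abs_freq / sum(dtm) * 100

whole_corpus <- cbind(count = abs_freq, percent = pct)
print(whole_corpus)
```

Grouping by a variable would amount to summing the rows of `dtm` within each level before taking the same ratios.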

The probability is that of observing a frequency of the considered term in the level at least as extreme as the one found, under a hypergeometric distribution based on the term's global frequency in the corpus and on the number of occurrences of all terms in the document or variable level considered. Whether the association is positive or negative can be read from the sign of the t value, or by comparing the value in the % Term/Level column with that in the Global % column.
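A minimal sketch of this computation in base R, using `phyper()` and `qnorm()` from the stats package. The counts are hypothetical, and the conversion of the probability into a signed normal quantile is an assumed convention for the t value, not necessarily the one the package uses.

```r
# Hypothetical counts (not taken from any real corpus)
N <- 10000  # occurrences of all terms in the corpus
K <- 50     # occurrences of the term in the corpus
n <- 1000   # occurrences of all terms in the level
k <- 15     # occurrences of the term in the level

pct_term_level <- k / n * 100  # share of the term within the level
pct_global     <- K / N * 100  # share of the term within the corpus

# Tail probability of a count at least as extreme as k, taken in the
# direction of the observed deviation, when n occurrences are drawn
# from a corpus holding K occurrences of the term and N - K of others
over <- pct_term_level > pct_global
prob <- if (over) {
  phyper(k - 1, K, N - K, n, lower.tail = FALSE)  # P(X >= k)
} else {
  phyper(k, K, N - K, n)                          # P(X <= k)
}

# Assumed convention: signed normal quantile of that probability,
# positive for over-representation, negative for under-representation
t_value <- if (over) qnorm(prob, lower.tail = FALSE) else qnorm(prob)
```

With these numbers the term is over-represented in the level (1.5% against 0.5% globally), so the upper tail is used and the resulting t value is positive.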

The kind of plot is chosen automatically from the selected measure. Row percents lead to bar plots, since the shown columns (terms) do not add up to 100%, as a pie chart would require. Absolute counts are also represented with bar plots, so that the vertical axis reports the number of occurrences.

When either several pie charts are drawn, one for each word, or a single word has been entered, the string %T in the plot title is replaced with the name of the term. In all cases, the string %V is replaced with the name of the selected variable.
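The placeholder substitution can be imitated with `gsub()`. The helper `fill_title()` below is hypothetical and only illustrates the behaviour described above; it is not part of the package.

```r
# Hypothetical helper: fill the %T and %V placeholders in a plot title.
# fixed = TRUE makes gsub() match the placeholders literally.
fill_title <- function(title, term, variable) {
  title <- gsub("%T", term, title, fixed = TRUE)
  gsub("%V", variable, title, fixed = TRUE)
}

plot_title <- fill_title("Occurrences of %T by %V",
                         term = "tax", variable = "Party")
print(plot_title)  # "Occurrences of tax by Party"
```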

See Also

termFrequencies, setCorpusVariables, meta, DocumentTermMatrix, barchart, pie