textometry (version 0.1.3)

specificities: Calculate Lexical Specificity Score

Description

Calculate the specificity score (also called association or surprise score) of a word occurring f times or more in a sub-corpus of t words, given that it occurs F times in total in a whole corpus of T words.
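
The quantity described above can be read as the upper tail of a hypergeometric distribution, the model used by Lafon (1980). The sketch below only illustrates that tail probability in base R; it is not the package's implementation, and the score returned by specificities may be scaled or signed differently. The helper name tail_probability and the argument names t_part, F_total and T_total are hypothetical (F and T are avoided as variable names because R treats them as shorthand for FALSE and TRUE).

```r
# Hypothetical helper, not part of textometry: probability of observing the
# word f times or more in a sub-corpus of t_part words, when it occurs
# F_total times in a corpus of T_total words.
tail_probability <- function(f, t_part, F_total, T_total) {
  # phyper(q, m, n, k): m = occurrences of the word, n = all other tokens,
  # k = size of the sub-corpus. lower.tail = FALSE gives P(X > q),
  # so q = f - 1 yields P(X >= f).
  phyper(f - 1, m = F_total, n = T_total - F_total, k = t_part,
         lower.tail = FALSE)
}

# Made-up numbers: a word seen 3 times in a 10-word corpus,
# 2 of them inside a 4-word sub-corpus.
p <- tail_probability(f = 2, t_part = 4, F_total = 3, T_total = 10)
print(p)
```

A small probability means the observed frequency would be surprising if the word were spread evenly over the corpus.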

Usage

specificities(lexicaltable, types=NULL, parts=NULL)

Arguments

lexicaltable
a complete lexical table, i.e. a numeric matrix in which each row represents a word and each column a part of the corpus. Each cell gives the frequency of the given word in the corresponding part of the corpus.
types
list of rows (words) for which the specificity score must be calculated. If NULL, the specificity score is calculated for every row; if types is a character vector, it indicates the row names for which the specificity score must be calculated.
parts
list of columns (parts) for which the specificity score must be calculated. If NULL, the specificity score is calculated for every part; if parts is a character vector, it indicates the column names for which the specificity score must be calculated.
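
As an illustration of the expected input, the sketch below builds a toy lexical table in base R and reads off the four quantities named in the Description. The counts, the row names (word_a, word_b, word_c) and the column names (P1, P2, P3) are made up for the example and do not come from any corpus.

```r
# Toy lexical table with made-up counts: rows are words, columns are parts.
lexicaltable <- matrix(
  c(5, 1, 0,
    2, 3, 4,
    1, 1, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("word_a", "word_b", "word_c"), c("P1", "P2", "P3"))
)

# Quantities for the cell (word_a, P1).
f       <- lexicaltable["word_a", "P1"]        # frequency of the word in the part
t_part  <- colSums(lexicaltable)[["P1"]]       # size of the part
F_total <- rowSums(lexicaltable)[["word_a"]]   # frequency of the word in the corpus
T_total <- sum(lexicaltable)                   # size of the corpus

print(c(f = f, t = t_part, F = F_total, T = T_total))
```

Row and column names matter here: they are what a character vector passed as types or parts would be matched against.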

Value

  • Returns a matrix of nrow(lexicaltable) * ncol(lexicaltable) (the number of rows and columns may be reduced using types or parts), each cell giving the specificity score.

References

Lafon P. (1980) Sur la variabilité de la fréquence des formes dans un corpus, Mots, 1, pp. 127-165. http://www.persee.fr/web/revues/home/prescript/article/mots_0243-6450_1980_num_1_1_1008

See Also

specificities.probabilities, specificities.lexicon

Examples

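library(textometry)

# Load the sample lexical table shipped with the package.
data(robespierre)
# Specificity score for every word (row) and every part (column).
spe <- specificities(robespierre)
# sep = " " keeps a space between the three fragments of the template.
string <- paste("The word %s appears f=%d times in a sub-corpus of t=%d words,",
                "given a total frequency of F=%d in the robespierre corpus made",
                "of T=%d words. The corresponding specificity score is %f", sep = " ")
print(sprintf(string,
              "peuple",
              robespierre["peuple", "D4"],
              colSums(robespierre)["D4"],
              rowSums(robespierre)["peuple"],
              sum(robespierre),
              spe["peuple", "D4"]))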