DramaAnalysis (version 3.0.0)

dictionaryStatistics: Dictionary Use

Description

These methods count the number of occurrences of dictionary words across different speakers and/or segments. The function dictionaryStatistics() calculates statistics for dictionaries with multiple entries; dictionaryStatisticsSingle() does so for a single word list.

The as.matrix() method extracts the numeric part of a QDDictionaryStatistics table as a matrix.

Usage

dictionaryStatistics(drama,
  fields = DramaAnalysis::base_dictionary[fieldnames],
  fieldnames = c("Liebe"), segment = c("Drama", "Act", "Scene"),
  normalizeByCharacter = FALSE, normalizeByField = FALSE,
  byCharacter = TRUE, column = "Token.lemma", ci = TRUE)

dictionaryStatisticsSingle(drama, wordfield = c(),
  segment = c("Drama", "Act", "Scene"),
  normalizeByCharacter = FALSE, normalizeByField = FALSE,
  byCharacter = TRUE, fieldNormalizer = length(wordfield),
  column = "Token.lemma", ci = TRUE, colnames = NULL)

# S3 method for QDDictionaryStatistics
as.matrix(x, ...)

Arguments

drama

A QDDrama object.

fields

A list of lists that contains the actual field names. By default, we load the base_dictionary.

fieldnames

A list of names for the dictionaries.

segment

The segment level that should be used. By default, the entire play will be used. Possible values are "Drama" (default), "Act" or "Scene".

normalizeByCharacter

Logical. Whether to normalize by character speech length.

normalizeByField

Logical. Whether to normalize by dictionary size. You usually want this.

byCharacter

Logical, defaults to TRUE. If FALSE, values are calculated for the entire segment (play, act, or scene) rather than for individual characters.

column

The table column the dictionary is applied to. Should be either "Token.surface" or "Token.lemma"; the latter is the default.

ci

Whether to ignore case. Defaults to TRUE, i.e., case is ignored.

wordfield

A character vector containing the words or lemmas to be counted (only for the *Single functions).

fieldNormalizer

Defaults to the length of the wordfield. If normalizeByField is TRUE, the absolute numbers are divided by this number.

colnames

The column names to be used in the output table.

x

An object of the type QDDictionaryStatistics, e.g., the output of dictionaryStatistics.

...

All other parameters are passed to as.matrix.data.frame().
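
The interplay of ci, normalizeByCharacter, normalizeByField and fieldNormalizer can be illustrated with a small base R sketch. This is not the package's implementation: the function count_field() and the toy token table are hypothetical, and it assumes that "character speech length" means the number of tokens a character speaks.

```r
# Toy token table (hypothetical data): one row per token.
tokens <- data.frame(
  Speaker = c("A", "A", "A", "B", "B", "B", "B"),
  Token.lemma = c("Liebe", "und", "Herz", "Krieg", "liebe", "der", "Herz"),
  stringsAsFactors = FALSE
)

# A hypothetical word field with two entries.
wordfield <- c("liebe", "herz")

# Illustrative helper, NOT part of DramaAnalysis.
count_field <- function(tokens, wordfield, column = "Token.lemma",
                        ci = TRUE, normalizeByCharacter = FALSE,
                        normalizeByField = FALSE,
                        fieldNormalizer = length(wordfield)) {
  values <- tokens[[column]]
  if (ci) {
    # Ignore case by lowercasing both the tokens and the word field.
    values <- tolower(values)
    wordfield <- tolower(wordfield)
  }
  hit <- values %in% wordfield
  # Absolute number of dictionary hits per speaker.
  counts <- tapply(hit, tokens$Speaker, sum)
  if (normalizeByCharacter) {
    # Assumption: speech length = number of tokens per speaker.
    counts <- counts / tapply(hit, tokens$Speaker, length)
  }
  if (normalizeByField) {
    # Divide by the dictionary size (or a custom normalizer).
    counts <- counts / fieldNormalizer
  }
  counts
}

count_field(tokens, wordfield)                          # absolute counts
count_field(tokens, wordfield, ci = FALSE)              # case-sensitive
count_field(tokens, wordfield, normalizeByField = TRUE) # per dictionary entry
```

Normalizing by field size makes dictionaries of different lengths comparable, which is presumably why the normalizeByField description says you usually want it.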

Value

A numeric matrix that contains the frequency with which a dictionary is present in a subset of tokens.
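
What "extracting the numeric part" of a statistics table means can be sketched in base R. The table layout below (one identifier column followed by one numeric column per dictionary) is an assumption made for illustration; the actual columns of a QDDictionaryStatistics object may differ.

```r
# Hypothetical statistics table: an identifier column plus one
# numeric column per dictionary (made-up values).
stats <- data.frame(
  character = c("A", "B"),
  Liebe = c(2, 1),
  Krieg = c(0, 3),
  stringsAsFactors = FALSE
)

# Keep only the numeric columns and convert them to a matrix.
numeric_cols <- vapply(stats, is.numeric, logical(1))
mat <- as.matrix(stats[, numeric_cols, drop = FALSE])

# Illustrative choice: label the rows with the identifier column.
rownames(mat) <- stats$character
mat
```

A plain numeric matrix of this kind can be passed directly to functions such as barplot() or heatmap(), which do not accept mixed-type tables.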

See Also

loadFields, characterNames

Examples

# NOT RUN {
# Check multiple dictionary entries
data(rksp.0)
dstat <- dictionaryStatistics(rksp.0, fieldnames=c("Krieg","Familie"))
# Check a single word list
fstat <- dictionaryStatisticsSingle(rksp.0, wordfield=c("der"))
# Extract the numeric part of the statistics as a matrix
mat <- as.matrix(dictionaryStatistics(rksp.0, fieldnames=c("Krieg","Familie")))
# }
