DramaAnalysis (version 3.0.0)

frequencytable2: Extract bigrams instead of words (currently not taking utterance boundaries into account)

Description

Extracts bigrams instead of single words. Utterance boundaries are currently not taken into account, so a bigram can span the end of one utterance and the start of the next.

Usage

frequencytable2(t, acceptedPOS = postags$de$words, names = FALSE,
  column = c("Token.surface", "Token.surface"), byCharacter = FALSE,
  segment = c("Drama", "Act", "Scene"))

Arguments

t

The text

acceptedPOS

A list of accepted POS tags

names

Whether to use character names or IDs

column

The column names to use (each should be either Token.surface or Token.lemma)

byCharacter

Whether the count is by character or by text

segment

Whether the count is by drama (default), act or scene

Value

Matrix of bigram frequencies in the format bigrams × segments
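
Examples

The following is a minimal base R sketch of what a bigrams × segments matrix looks like and of how ignoring utterance boundaries affects the counts. It does not call frequencytable2 and is not the package's implementation. The toy token table, the Speaker column, and the choice to assign each bigram to the segment of its first token are assumptions made for illustration only.

```r
# Hypothetical toy token table. The column names Act and Token.surface mirror
# the documented defaults; the data and the Speaker column are invented.
tokens <- data.frame(
  Act           = c(1, 1, 1, 1, 2, 2, 2),
  Speaker       = c("a", "a", "b", "b", "a", "a", "a"),
  Token.surface = c("to", "be", "or", "not", "to", "be", "free"),
  stringsAsFactors = FALSE
)

n <- nrow(tokens)

# Pair every token with its successor. Speaker changes are ignored,
# so "be or" is counted although it spans two utterances.
bigrams <- paste(tokens$Token.surface[-n], tokens$Token.surface[-1])

# Assumption: a bigram belongs to the segment of its first token.
segment <- tokens$Act[-n]

# Cross-tabulate: one row per bigram, one column per segment.
freq <- unclass(table(bigram = bigrams, segment = segment))

print(freq)
```

In this sketch the bigram "to be" is counted once in each act, while "be or" and "not to" exist only because neighbouring tokens are paired regardless of who speaks them or where a segment ends.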