Extracts bigrams instead of single words. Utterance boundaries are currently not taken into account, so a bigram may span two adjacent utterances.
frequencytable2(t, acceptedPOS = postags$de$words, names = FALSE,
column = c("Token.surface", "Token.surface"), byCharacter = FALSE,
segment = c("Drama", "Act", "Scene"))
t: The text.
acceptedPOS: A list of accepted POS tags.
names: Whether to use character names or ids.
column: The column names to use (each should be either Token.surface or Token.lemma).
byCharacter: Whether to count by character (TRUE) or by text (FALSE, the default).
segment: Whether to count by drama (default), act, or scene.
A matrix of bigram frequencies in the format bigrams x segments, i.e. one row per bigram and one column per segment.
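
To illustrate the shape of the returned value, the following base R sketch counts bigrams per segment on a toy token table. It is not the package's implementation: the data frame `tokens`, its `drama` column, and the helper `bigrams_of` are hypothetical names made up for this example; only the column name Token.surface comes from the documentation above. As in the function described here, utterance boundaries are ignored.

```r
# Hypothetical toy token table (assumption: one row per token, in text order).
tokens <- data.frame(
  drama = c("d1", "d1", "d1", "d1", "d2", "d2", "d2"),
  Token.surface = c("to", "be", "or", "not", "to", "be", "sure"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pair each token with its successor.
bigrams_of <- function(x) {
  if (length(x) < 2) return(character(0))
  paste(head(x, -1), tail(x, -1))
}

# One character vector of bigrams per segment (here: per drama),
# so that no bigram crosses a segment boundary.
per_segment <- lapply(split(tokens$Token.surface, tokens$drama), bigrams_of)

# Long format (one row per bigram occurrence), then cross-tabulate.
long <- data.frame(
  bigram = unlist(per_segment, use.names = FALSE),
  segment = rep(names(per_segment), lengths(per_segment)),
  stringsAsFactors = FALSE
)

# Integer matrix: rows are bigrams, columns are segments.
freq <- unclass(table(long$bigram, long$segment))
print(freq)
```

Splitting by segment before pairing tokens keeps bigrams from crossing segment boundaries; whether frequencytable2 does the same internally is not stated in this page.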