topWords: Top Words per Topic

Description

Determines the top words per topic, as top.topic.words does. In addition, the values used to rank the words can be requested. These values are computed by the function importance, which can also be called on its own.

Usage

topWords(topics, numWords = 1, byScore = TRUE, epsilon = 1e-05, values = FALSE)
importance(topics, epsilon = 1e-05)

Value

Matrix of top words or, if values is TRUE, a list of matrices with entries word and val.

Arguments

topics: named matrix: The counts of vocabulary words (columns) in topics (rows).
numWords: integer(1): The number of requested top words per topic.
byScore: logical(1): Should the values that are taken for determining the top words per topic be calculated by the function importance (TRUE) or should the absolute counts be considered (FALSE)?
epsilon: numeric(1): Small number added inside the logarithmic calculations to avoid evaluating log(0).
values: logical(1): Should the values that are taken for determining the top words per topic be returned?
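
The difference between byScore = TRUE and byScore = FALSE can be illustrated with a small base R sketch. It does not call the package: score_sketch and top_n are hypothetical helpers, the counts are made up, and the score formula is an assumption (the form used by top.topic.words in the lda package), not something this page confirms for importance.

```r
# Toy topic-word count matrix: 2 topics (rows) x 4 words (columns).
# The counts are invented for illustration only.
topics <- matrix(
  c(10L, 1L, 0L, 9L,
    10L, 0L, 6L, 1L),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, c("the", "fish", "integral", "man"))
)

# ASSUMED score: relative frequency of a word within a topic, weighted by
# how far its log frequency lies above the word's mean log frequency
# across all topics. epsilon keeps log() away from log(0).
score_sketch <- function(topics, epsilon = 1e-05) {
  rel <- topics / rowSums(topics)           # row-wise relative frequencies
  logrel <- log(rel + epsilon)
  rel * sweep(logrel, 2, colMeans(logrel))  # subtract column means of the logs
}

# Hypothetical helper: names of the n highest-valued columns in each row.
top_n <- function(values, n = 1) {
  apply(values, 1, function(x) {
    colnames(values)[order(x, decreasing = TRUE)[seq_len(n)]]
  })
}

by_count <- top_n(topics, n = 1)                # analogue of byScore = FALSE
by_score <- top_n(score_sketch(topics), n = 1)  # analogue of byScore = TRUE

print(by_count)  # "the" "the"
print(by_score)  # "man" "integral"
```

Ranking by raw counts picks the word that is frequent in every topic, while the score favours words that are frequent in one topic and rare in the others, which is usually what makes top words useful for labelling topics.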

Examples

texts <- list(
 A = "Give a Man a Fish, and You Feed Him for a Day.
      Teach a Man To Fish, and You Feed Him for a Lifetime",
 B = "So Long, and Thanks for All the Fish",
 C = "A very able manipulative mathematician, Fisher enjoys a real mastery
      in evaluating complicated multiple integrals.")

corpus <- textmeta(meta = data.frame(id = c("A", "B", "C", "D"),
  title = c("Fishing", "Don't panic!", "Sir Ronald", "Berlin"),
  date = c("1885-01-02", "1979-03-04", "1951-05-06", "1967-06-02"),
  additionalVariable = 1:4, stringsAsFactors = FALSE), text = texts)

corpus <- cleanTexts(corpus)
wordlist <- makeWordlist(corpus$text)
ldaPrep <- LDAprep(text = corpus$text, vocab = wordlist$words)

LDA <- LDAgen(documents = ldaPrep, K = 3L, vocab = wordlist$words, num.words = 3)
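# Top word of each topic (numWords = 1 by default), ranked by importance
topWords(LDA$topics)

# Values used for the ranking when byScore = TRUE
importance(LDA$topics)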
