mallet.topic.labels: Get strings containing the most probable words for each topic

Description

This function returns a vector of strings, one for each topic, with the most probable words in that topic separated by spaces.

Usage

mallet.topic.labels(topic.model, topic.words = NULL, num.top.words = 3, ...)

Value

A character vector with one element per topic. Each element holds that topic's top words, separated by spaces.

Arguments

topic.model: A cc.mallet.topics.RTopicModel object created by MalletLDA.
topic.words: The matrix of topic-word weights returned by mallet.topic.words. The default (NULL) extracts the topic-word weights from topic.model.
num.top.words: The number of words to include for each topic. Defaults to 3.
...: Further arguments supplied to mallet.topic.words.
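
To make the return value concrete, here is a minimal base-R sketch of the idea: for each row of a topic-word weight matrix, take the highest-weighted words and join them with spaces. The toy vocabulary, the weights, and the helper toy_topic_labels are hypothetical and exist only for illustration. This is not the package's implementation, and it needs neither mallet nor Java.

```r
# Hypothetical toy data: 2 topics (rows) over a 5-word vocabulary (columns).
vocab <- c("tax", "war", "peace", "budget", "treaty")
topic_words <- rbind(
  c(0.40, 0.04, 0.06, 0.45, 0.05),
  c(0.05, 0.35, 0.30, 0.04, 0.26)
)

# Hypothetical helper (not part of the mallet package): build one label
# per topic from the num_top_words highest-weighted vocabulary entries.
toy_topic_labels <- function(topic_words, vocab, num_top_words = 3) {
  apply(topic_words, 1, function(weights) {
    # Indices of the largest weights in this topic, in descending order
    top <- order(weights, decreasing = TRUE)[seq_len(num_top_words)]
    paste(vocab[top], collapse = " ")
  })
}

labels <- toy_topic_labels(topic_words, vocab, num_top_words = 3)
print(labels)
```

The result is a character vector with one element per topic, the same shape as the value described above. Raising num_top_words gives longer labels.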

Examples

if (FALSE) {
# Read in sotu example data
data(sotu)
sotu.instances <-
   mallet.import(id.array = row.names(sotu),
                 text.array = sotu[["text"]],
                 stoplist = mallet_stoplist_file_path("en"),
                 token.regexp = "\\p{L}[\\p{L}\\p{P}]+\\p{L}")

# Create topic model
topic.model <- MalletLDA(num.topics = 10, alpha.sum = 1, beta = 0.1)
topic.model$loadDocuments(sotu.instances)

# Train topic model
topic.model$train(200)

# Create hierarchical clusters of topics, labelled with each topic's top words
doc_topics <- mallet.doc.topics(topic.model, smoothed = TRUE, normalized = TRUE)
topic_words <- mallet.topic.words(topic.model, smoothed = TRUE, normalized = TRUE)
topic_labels <- mallet.topic.labels(topic.model)
plot(mallet.topic.hclust(doc_topics, topic_words, balance = 0.3), labels = topic_labels)
}
