mallet (version 1.3.0)

mallet.topic.labels: Get strings containing the most probable words for each topic

Description

This function returns a vector of strings, one for each topic, with the most probable words in that topic separated by spaces.

Usage

mallet.topic.labels(topic.model, topic.words = NULL, num.top.words = 3, ...)

Value

a character vector with one element per topic

Arguments

topic.model

A cc.mallet.topics.RTopicModel object created by MalletLDA.

topic.words

The matrix of topic-word weights returned by mallet.topic.words. The default (NULL) extracts the topic-word weights from topic.model.

num.top.words

The number of words to include for each topic. Defaults to 3.

...

Further arguments supplied to mallet.topic.words.

See Also

mallet.topic.words produces topic-word weights. mallet.top.words produces a data frame for a single topic.

Examples

if (FALSE) {
library(mallet)

# Read in sotu example data
data(sotu)
sotu.instances <-
   mallet.import(id.array = row.names(sotu),
                 text.array = sotu[["text"]],
                 stoplist = mallet_stoplist_file_path("en"),
                 token.regexp = "\\p{L}[\\p{L}\\p{P}]+\\p{L}")

# Create topic model
topic.model <- MalletLDA(num.topics=10, alpha.sum = 1, beta = 0.1)
topic.model$loadDocuments(sotu.instances)

# Train topic model
topic.model$train(200)

# Create hierarchical clusters of topics
doc_topics <- mallet.doc.topics(topic.model, smoothed=TRUE, normalized=TRUE)
topic_words <- mallet.topic.words(topic.model, smoothed=TRUE, normalized=TRUE)
topic_labels <- mallet.topic.labels(topic.model)
plot(mallet.topic.hclust(doc_topics, topic_words, balance = 0.3), labels=topic_labels)
}
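
The labelling step itself can be illustrated without a trained model. The following is a minimal base R sketch of the idea described above (take the highest-weighted words in each row of a topic-word matrix and join them with spaces). It is not the package's implementation: the helper sketch_topic_labels, the toy vocabulary, and the weights are all hypothetical and exist only for illustration.

```r
# Toy topic-word weight matrix: one row per topic, one column per word.
# The vocabulary and weights are invented for illustration only.
vocab <- c("tax", "war", "peace", "budget", "treaty", "army")
topic_words <- matrix(
  c(0.40, 0.05, 0.05, 0.35, 0.05, 0.10,
    0.05, 0.35, 0.20, 0.05, 0.10, 0.25),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, vocab)
)

# Hypothetical helper (not part of mallet): join the n highest-weighted
# words of each topic into a single space-separated string.
sketch_topic_labels <- function(topic_words, vocab, num_top_words = 3) {
  apply(topic_words, 1, function(weights) {
    # Indices of the words with the largest weights in this topic
    top <- order(weights, decreasing = TRUE)[seq_len(num_top_words)]
    paste(vocab[top], collapse = " ")
  })
}

labels <- sketch_topic_labels(topic_words, vocab)
print(labels)
```

With a trained model, the corresponding call under the Usage signature above would be along the lines of mallet.topic.labels(topic.model, topic_words, num.top.words = 3), where topic_words is the matrix returned by mallet.topic.words.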
