mallet (version 1.3.0)

mallet.topic.words: Retrieve a matrix of word weights for topics

Description

This function returns a matrix with one row for every topic and one column for every word in the vocabulary.

Usage

mallet.topic.words(topic.model, normalized = FALSE, smoothed = FALSE)

Value

A numeric matrix with one row per topic and one column per vocabulary word (number of topics by vocabulary size).

Arguments

topic.model

A cc.mallet.topics.RTopicModel object created by MalletLDA.

normalized

If TRUE, normalize the rows so that each topic sums to one. If FALSE, values are the raw counts of each word type assigned to each topic: integers, plus the smoothing constant when smoothed = TRUE.

smoothed

If TRUE, add the smoothing parameter for the model (initial value specified as beta in MalletLDA). If FALSE, many values will be zero.
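
The effect of the two flags can be illustrated without a trained model. The following is a minimal base R sketch on a made-up count matrix, not the package's implementation; the counts, the word labels, and the value of beta are all hypothetical, and it assumes smoothing is applied before normalization.

```r
# Toy topic-by-word count matrix: 2 topics, 4 vocabulary words (made-up numbers).
counts <- matrix(c(3, 0, 1, 0,
                   0, 2, 0, 2),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("topic1", "topic2"),
                                 c("tax", "war", "jobs", "peace")))

# Hypothetical smoothing constant, standing in for the model's beta.
beta <- 0.1

# Analogue of smoothed = TRUE, normalized = FALSE:
# add beta to every cell, so no value is exactly zero.
smoothed <- counts + beta

# Analogue of smoothed = TRUE, normalized = TRUE:
# divide each row by its own sum so that each topic sums to one.
# (Assumption: smoothing happens before normalization.)
smoothed_normalized <- smoothed / rowSums(smoothed)

print(smoothed)
print(rowSums(smoothed_normalized))
```

With both flags off, the analogue is simply counts itself, which is why many entries are zero in that case.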

Examples

if (FALSE) {
# Read in sotu example data
data(sotu)
sotu.instances <-
   mallet.import(id.array = row.names(sotu),
                 text.array = sotu[["text"]],
                 stoplist = mallet_stoplist_file_path("en"),
                 token.regexp = "\\p{L}[\\p{L}\\p{P}]+\\p{L}")

# Create topic model
topic.model <- MalletLDA(num.topics=10, alpha.sum = 1, beta = 0.1)
topic.model$loadDocuments(sotu.instances)

# Train topic model
topic.model$train(200)

# Extract results
doc_topics <- mallet.doc.topics(topic.model, smoothed=TRUE, normalized=TRUE)
topic_words <- mallet.topic.words(topic.model, smoothed=TRUE, normalized=TRUE)
top_words <- mallet.top.words(topic.model, word.weights = topic_words[2,], num.top.words = 5)
}
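
The last step of the example passes one row of the topic-word matrix as word.weights. As a rough illustration of what ranking a single row involves, here is a base R sketch on a made-up weight vector. It is a stand-in for the idea only, not mallet.top.words itself; the words, weights, and output column names are hypothetical.

```r
# One hypothetical row of a normalized topic-word matrix, labelled by word.
weights <- c(tax = 0.45, war = 0.05, jobs = 0.30, peace = 0.20)

# Number of top words to keep (plays the role of num.top.words).
n_top <- 2

# Sort the weights from largest to smallest and keep the first n_top.
top <- sort(weights, decreasing = TRUE)[seq_len(n_top)]

# Collect the result in a small data frame (column names are illustrative).
top_words <- data.frame(term = names(top),
                        weight = unname(top),
                        stringsAsFactors = FALSE)
print(top_words)
```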
