This function calls GetProbableTerms with some rules to get topic labels.
It is in "super-ultra-mega alpha"; use at your own risk/discretion.
LabelTopics(assignments, dtm, M = 2)
assignments: A documents-by-topics matrix similar to theta. This works best
if the matrix is sparse, with only a few non-zero topics per document.
dtm: A document term matrix of class matrix or dgCMatrix. The columns of dtm
should be n-grams whose colnames have a "_" where spaces would be between
the words.
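If the n-gram column names contain spaces, they can be converted to the underscore form described above. The following is a minimal base R sketch, assuming a small hand-made matrix; `toy_dtm` and its terms are hypothetical and not part of textmineR.

```r
# Hypothetical toy document term matrix with unigram and bigram columns.
toy_dtm <- matrix(
  c(1, 0, 2,
    0, 3, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("doc_1", "doc_2"),
                  c("cancer", "breast cancer", "clinical trial"))
)

# Replace the spaces between words with "_" so bigrams read "breast_cancer".
colnames(toy_dtm) <- gsub(" ", "_", colnames(toy_dtm), fixed = TRUE)

colnames(toy_dtm)
```

A dtm that already uses "_" between words is left unchanged by the gsub() step.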
M: The number of n-gram labels you want to return. Defaults to 2.
Returns a matrix whose rows correspond to topics and whose j-th column
corresponds to the j-th "best" label assignment.
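To illustrate that layout, here is a rough sketch of reading the best label per topic from a matrix of this shape. `toy_labels` is a hand-made stand-in, not real LabelTopics output; its row names, column names, and label strings are assumptions made up for illustration.

```r
# Hypothetical stand-in for a labels matrix with M = 2:
# one row per topic, one column per ranked label.
# The dimnames and label strings are invented for illustration only.
toy_labels <- matrix(
  c("breast_cancer",  "cancer_cells",
    "clinical_trial", "randomized_trial"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("t_1", "t_2"), c("label_1", "label_2"))
)

# The first column holds the "best" label for each topic.
best <- toy_labels[, 1]

# Undo the "_" convention to get display-friendly labels.
best_readable <- gsub("_", " ", best, fixed = TRUE)
best_readable
```

The same indexing should apply to a real result, for example `labels[, 1]`, provided it follows the rows-are-topics layout described above.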
# NOT RUN {
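# use a sample topic model whose dtm (m$data) has unigrams and bigrams
library(textmineR)
data(nih_sample_topic_model)
m <- nih_sample_topic_model
# zero out topic weights below 0.05, then renormalize each document's row
assignments <- t(apply(m$theta, 1, function(x){
  x[ x < 0.05 ] <- 0
  x / sum(x)
}))
# rows with no weight left divide 0 by 0 and give NaN; set those to 0
assignments[is.na(assignments)] <- 0
# get the two best n-gram labels for each topic
labels <- LabelTopics(assignments = assignments, dtm = m$data, M = 2)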
# }