
stm (version 1.1.3)

findThoughts: Find Thoughts

Description

Outputs the most representative documents for a particular topic. Use this to get a better sense of the content of actual documents that have a high estimated proportion of the topic.

Usage

findThoughts(model, texts=NULL, topics=NULL, n=3, thresh=0.0)

Arguments

model
Model object created by stm.
texts
A character vector where each entry contains the text of a document. Entries must be in the same order as the documents object used to fit the model.
topics
The topic number or vector of topic numbers for which you want to find thoughts. Defaults to all topics.
n
The number of desired documents to be displayed per topic.
thresh
Sets a minimum threshold for the estimated topic proportion for displayed documents. It defaults to imposing no restrictions.

Value

A findThoughts object
index
List with one entry per topic. Each entry is a vector of document indices.
docs
List with one entry per topic. Each entry is a character vector of the corresponding texts.

Details

Returns the top n documents ranked by the MAP estimate of the topic's theta value (the modal estimate of the proportion of word tokens assigned to the topic under the model). Setting the thresh argument specifies a minimum value of theta that a document must reach to be returned. The result contains both the document indices and the corresponding texts.
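The ranking described above can be sketched in base R. This is only an illustration under stated assumptions, not the stm implementation: the theta matrix and texts are made-up values, top_docs is a hypothetical helper, and treating the threshold as inclusive (>=) is an assumption.

```r
# Made-up document-topic proportions: 5 documents (rows) x 2 topics (columns).
theta <- matrix(c(0.90, 0.10,
                  0.20, 0.80,
                  0.60, 0.40,
                  0.05, 0.95,
                  0.50, 0.50),
                ncol = 2, byrow = TRUE)
texts <- c("doc A", "doc B", "doc C", "doc D", "doc E")

# top_docs() is a hypothetical helper, not part of stm.
top_docs <- function(theta, texts, topic, n = 3, thresh = 0) {
  # Rank documents by their proportion for the topic, highest first.
  ord <- order(theta[, topic], decreasing = TRUE)
  # Drop documents below the threshold (inclusive comparison is an assumption).
  ord <- ord[theta[ord, topic] >= thresh]
  # Keep at most n documents.
  idx <- head(ord, n)
  list(index = idx, docs = texts[idx])
}

res <- top_docs(theta, texts, topic = 1, n = 3, thresh = 0.55)
print(res)
```

With thresh = 0.55 only two documents qualify for topic 1, so fewer than n documents come back; this mirrors how a threshold can shorten the result.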

The plot.findThoughts function is a shortcut for the plotQuote function.

See Also

plotQuote

Examples

# gadarian and gadarianFit are example objects shipped with stm
library(stm)
findThoughts(gadarianFit, texts=gadarian$open.ended.response, topics=c(1,2), n=3)

# We can plot findThoughts objects using plot() or plotQuote()
thought <- findThoughts(gadarianFit, texts=gadarian$open.ended.response, topics=1, n=3)

# plotQuote() takes a set of sentences
plotQuote(thought$docs[[1]])

# The generic plot() is a shorthand that makes one plot per topic
plot(thought)

# Either approach can also plot a subset of the examples
plot(thought,2:3)
plotQuote(thought$docs[[1]][2:3])
