stm (version 1.1.3)

plot.STM: Plot summary of an STM object

Description

Produces one of four types of plots for an STM object. The default option "summary" prints topic words with their corpus frequency. "labels" is for easy printing of tables of indicative words for each topic. "perspectives" depicts differences between two topics, content covariates or combinations. "hist" creates a histogram of the expected distribution of topic proportions across the documents.

Usage

## S3 method for class 'STM'
plot(x, type = c("summary", "labels", "perspectives", "hist"),
     n = NULL, topics = NULL,
     labeltype = c("prob", "frex", "lift", "score"), frexw = 0.5,
     main = NULL, xlim = NULL, ylim = NULL, xlab = NULL, family = "",
     width = 80, covarlevels = NULL, plabels = NULL, text.cex = 1,
     custom.labels = NULL, topic.names = NULL, ...)

Arguments

x
Model output from stm.
type
Sets the desired type of plot. See details for more information.
n
Sets the number of words used to label each topic. In perspective plots it approximately sets the total number of words in the plot. The defaults are 3, 20 and 25 for summary, labels and perspectives respectively.
topics
Vector of topics to display. For perspectives plots this must be a vector of length one or two. For the other types it defaults to all topics.
labeltype
Determines which of "prob", "frex", "lift" or "score" is used to choose the most important words. See labelTopics for more detail. Passing an argument to custom.labels will override this.
frexw
If "frex" labeltype is used, this will be the frex weight.
main
Title of the plot.
xlim
Range of the X-axis.
ylim
Range of the Y-axis.
xlab
Labels for the X-axis. For perspective plots, use plabels instead.
family
The font family. Most users will not need to specify this, but it can be useful when working with other character sets. See par.
width
Sets the width, in number of characters, used for string wrapping in type "labels".
covarlevels
A vector of length one or length two which contains the levels of the content covariate to be used in perspective plots.
plabels
This option can be used to override the default labels in the perspective plot that appear along the x-axis. It should be a character vector of length two which has the left hand side label first.
text.cex
Controls the scaling constant on text size.
custom.labels
A vector of custom labels to use in place of the words selected by labeltype.
topic.names
A vector of custom topic names. Defaults to "Topic #: ".
...
Additional parameters passed to plotting functions.
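
As a rough illustration of how several of these arguments combine, the sketch below uses the gadarianFit model that ships with stm. The topic names are hypothetical placeholders invented for this example, and reading the number of topics from x$settings$dim$K is an assumption about the internal layout of the fitted object rather than a documented accessor.

```r
library(stm)  # assumes the stm package is installed

# Label only topics 1 and 2, using the five highest-ranked FREX words each.
plot(gadarianFit, type = "labels", topics = c(1, 2),
     labeltype = "frex", n = 5)

# Number of topics in the fitted model (assumed location inside the object).
K <- gadarianFit$settings$dim$K

# Hypothetical names, one per topic, used in place of the default "Topic #: ".
my_names <- paste0("Theme ", seq_len(K), ": ")

# Summary plot with the custom names, a custom title and smaller text.
plot(gadarianFit, type = "summary", topic.names = my_names,
     main = "Expected topic proportions", text.cex = 0.8)
```

Changing labeltype only changes which words are chosen as labels; the model itself is not refit.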

Details

The function produces four types of plots summarizing an STM object; the argument type selects which one.

summary displays the topics ordered by their expected frequency across the corpus.

labels plots the top words for each selected topic, chosen according to labeltype.

perspectives plots two topics or topic-covariate combinations. Words are sized in proportion to their use within the plotted topic-covariate combinations and positioned along the X-axis according to how strongly they favor one of the two configurations. If words overlap, the user can either enlarge the plot or reduce the number of words shown. The vertical placement of the words is random, so rerunning the plot produces a different layout each time.

hist plots a histogram of the MAP estimates of the document-topic loadings across all documents. The median is marked by a dashed red line.
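
The perspectives and hist behavior described above can be sketched as follows. Calling set.seed() first is an assumption: it should make the vertical word placement repeatable only if that placement draws on R's standard random number generator. Reading the document-topic matrix from gadarianFit$theta is likewise an assumption about the fitted object's structure.

```r
library(stm)  # assumes the stm package is installed

# Fix the random number stream so that the layout can be reproduced
# (assumes the vertical placement uses R's standard RNG).
set.seed(2014)

# Fewer words (n = 15) reduces overlap; plabels sets the two x-axis labels,
# left-hand side first.
plot(gadarianFit, type = "perspectives", topics = c(1, 2), n = 15,
     plabels = c("Topic 1", "Topic 2"))

# Histogram of document-topic proportions for topic 1 only.
plot(gadarianFit, type = "hist", topics = 1)

# The value marked by the dashed line can be checked by hand:
# median of the topic 1 column of the document-topic matrix.
med <- median(gadarianFit$theta[, 1])
med
```

If the words still overlap, enlarging the graphics device before plotting is the other option the text mentions.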

References

Roberts, Margaret E., Brandon M. Stewart, Dustin Tingley, Christopher Lucas, Jetson Leder-Luis, Shana Kushner Gadarian, Bethany Albertson, and David G. Rand. "Structural Topic Models for Open-Ended Survey Responses." American Journal of Political Science 58, no. 4 (2014): 1064-1082.

See Also

plotQuote, plot.topicCorr

Examples

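# Examples with the Gadarian data (gadarianFit ships with stm)
library(stm)
plot(gadarianFit)                                           # default: type = "summary"
plot(gadarianFit, type = "labels")                          # top words per topic
plot(gadarianFit, type = "perspectives", topics = c(1, 2))  # contrast topics 1 and 2
plot(gadarianFit, type = "hist")                            # topic proportion histograms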
