plot.STM: Plot summary of an STM object

Description

Produces one of four types of plots for an STM object. The default option "summary" prints topic words with their corpus frequency. "labels" is for easy printing of tables of indicative words for each topic. "perspectives" depicts differences between two topics, content covariates or combinations. "hist" creates a histogram of the expected distribution of topic proportions across the documents.

Usage

## S3 method for class 'STM'
plot(x, type = c("summary", "labels", "perspectives", "hist"), n = NULL, topics = NULL, labeltype = c("prob", "frex", "lift", "score"), frexw = 0.5, main = NULL, xlim = NULL, ylim = NULL, xlab = NULL, family = "", width = 80, covarlevels = NULL, plabels = NULL, text.cex = 1, custom.labels = NULL, topic.names = NULL, ...)

Arguments

x

Model output from stm.

type

Sets the desired type of plot. See details for more information.

n

Sets the number of words used to label each topic. In perspective plots it approximately sets the total number of words in the plot. The defaults are 3, 20 and 25 for summary, labels and perspectives respectively.

topics

Vector of topics to display. For type "perspectives" this must be a vector of length one or two. For the other types it defaults to all topics.

labeltype

Determines which of "prob", "frex", "lift" or "score" is used to choose the most important words. See labelTopics for more detail. Passing an argument to custom.labels will override this.

frexw

If "frex" labeltype is used, this will be the frex weight.

main

Title of the plot.

xlim

Range of the X-axis.

ylim

Range of the Y-axis.

xlab

Labels for the X-axis. For perspective plots, use plabels instead.

family

The font family. Most users will not need to specify this, but it can be useful when working with other character sets. See par.

width

Sets the width, in number of characters, used for string wrapping in type "labels".

covarlevels

A vector of length one or length two which contains the levels of the content covariate to be used in perspective plots.

plabels

This option can be used to override the default labels in the perspective plot that appear along the x-axis. It should be a character vector of length two which has the left hand side label first.

text.cex

Controls the scaling constant on text size.

custom.labels

A vector of custom labels. When supplied, it overrides the choice made in labeltype.

topic.names

A vector of custom topic names. Defaults to "Topic #: ".

...

Additional parameters passed to plotting functions.
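
As a rough illustration of how several of these arguments combine, the sketch below uses the gadarianFit model that appears in the Examples section. It assumes the stm package is installed and that gadarianFit has at least two topics and no content covariate. The argument values (n = 5, width = 40 and so on) are arbitrary choices for illustration, not recommendations.

```r
library(stm)  # assumption: stm is installed; it supplies plot.STM and gadarianFit

# Summary plot labelled with 5 words per topic, chosen by the "frex" criterion.
# frexw is only consulted because labeltype = "frex".
plot(gadarianFit, type = "summary", n = 5, labeltype = "frex", frexw = 0.5,
     main = "Topics by expected frequency (FREX labels)")

# Labels plot restricted to topics 1 and 2, wrapping the label text at
# 40 characters and shrinking the text slightly.
plot(gadarianFit, type = "labels", topics = c(1, 2), n = 10,
     width = 40, text.cex = 0.9)
```

Each call draws on the active graphics device, so the second plot replaces the first unless a new device or a multi-panel layout is set up beforehand.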

Details

The function can produce four types of plots that summarize an STM object; the type is chosen by the argument type. summary produces a plot which displays the topics ordered by their expected frequency across the corpus. labels plots the top words, selected according to the chosen criterion, for each selected topic. perspectives plots two topics or topic-covariate combinations. Words are sized in proportion to their use within the plotted topic-covariate combinations and positioned along the X-axis according to how much they favor one of the two configurations. If the words cluster on top of each other, the user can either make the plot larger or reduce the total number of words on the plot. The vertical placement of the words is random, so rerunning the plot produces a different layout each time. hist plots a histogram of the MAP estimates of the document-topic loadings across all documents. The median is marked by a dashed red line.
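
The perspectives and hist behaviour described above might be explored as follows. This is a sketch under assumptions: the stm package is installed, gadarianFit is the example model from the Examples section, and the random vertical placement draws on R's standard random number generator, so that set.seed() fixes the layout. The seed value, the word count and the axis labels are illustrative only.

```r
library(stm)  # assumption: stm is installed; it supplies plot.STM and gadarianFit

# Fix the seed so the (assumed RNG-driven) vertical placement is repeatable,
# and lower n from its default of 25 to reduce overlapping words.
set.seed(2014)  # arbitrary seed, chosen for illustration
plot(gadarianFit, type = "perspectives", topics = c(1, 2), n = 15,
     plabels = c("Topic 1", "Topic 2"))  # left-hand label first

# Histograms of the document-topic loadings for topics 1 and 2 only.
plot(gadarianFit, type = "hist", topics = c(1, 2))
```

If words still overlap in the perspectives plot, enlarging the graphics device before plotting is the other remedy mentioned above.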

References

Roberts, Margaret E., Brandon M. Stewart, Dustin Tingley, Christopher Lucas, Jetson Leder-Luis, Shana Kushner Gadarian, Bethany Albertson, and David G. Rand. "Structural Topic Models for Open-Ended Survey Responses." American Journal of Political Science 58, no. 4 (2014): 1064-1082.

Examples

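# Examples with the Gadarian data
library(stm)  # provides plot.STM and the gadarianFit example model
plot(gadarianFit)                                           # default: type = "summary"
plot(gadarianFit, type = "labels")                          # indicative words per topic
plot(gadarianFit, type = "perspectives", topics = c(1, 2))  # contrast topics 1 and 2
plot(gadarianFit, type = "hist")                            # topic proportions across documents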