# stm_tidiers

##### Tidiers for Structural Topic Models from the stm package

Tidy topic models fit by the stm package. The arguments and return values
are similar to `lda_tidiers`.

##### Usage

```
# S3 method for STM
tidy(x, matrix = c("beta", "gamma", "theta"),
  log = FALSE, document_names = NULL, ...)

# S3 method for estimateEffect
tidy(x, ...)

# S3 method for STM
augment(x, data, ...)

# S3 method for STM
glance(x, ...)
```

##### Arguments

- x
An STM fitted model object from either `stm` or `estimateEffect` from the stm package.

- matrix
Whether to tidy the beta (per-term-per-topic, default) or gamma/theta (per-document-per-topic) matrix. The stm package calls this the theta matrix, but other topic modeling packages call this gamma.

- log
Whether beta/gamma/theta should be on a log scale, default FALSE

- document_names
Optional vector of document names for use with per-document-per-topic tidying

- ...
Extra arguments, not used

- data
For `augment`, the data given to the stm function, either as a `dfm` from quanteda or as a tidied table with "document" and "term" columns

##### Value

`tidy` returns a tidied version of either the beta or gamma matrix if
called on an object from `stm`, or a tidied version of the estimated regressions
if called on an object from `estimateEffect`.

`augment` must be provided a data argument, either a `dfm` from quanteda or a
table containing one row per original document-term pair, such as is returned
by tdm_tidiers, containing columns `document` and `term`. It returns that same
data as a table with an additional column `.topic` with the topic assignment
for that document-term combination.

`glance` always returns a one-row table, with columns

- k
Number of topics in the model

- docs
Number of documents in the model

- terms
Number of terms in the model

- iter
Number of iterations used

- alpha
If an LDA model, the parameter of the Dirichlet distribution for topics over documents

##### See Also

If `matrix == "beta"` (default), returns a table with one row per topic and term,
with columns

- topic
Topic, as an integer

- term
Term

- beta
Probability of a term generated from a topic according to the structural topic model
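The shape of the beta output can be illustrated with a minimal base-R sketch. It is not the tidytext implementation: the matrix `beta_mat`, the vector `vocab`, and the flag `use_log` are hypothetical stand-ins for a fitted model's term probabilities, its vocabulary, and the `log` argument.

```r
# Hypothetical per-topic term probabilities: 2 topics (rows) by 3 terms (columns).
vocab <- c("anne", "emma", "elinor")
beta_mat <- matrix(c(0.7, 0.2, 0.1,
                     0.1, 0.3, 0.6),
                   nrow = 2, byrow = TRUE)
use_log <- FALSE  # stands in for the `log` argument

# as.vector() walks the matrix column by column, so topic varies fastest
# and each term is repeated once per topic.
values <- as.vector(beta_mat)
td_beta_sketch <- data.frame(
  topic = rep(seq_len(nrow(beta_mat)), times = ncol(beta_mat)),
  term  = rep(vocab, each = nrow(beta_mat)),
  beta  = if (use_log) log(values) else values,
  stringsAsFactors = FALSE
)
td_beta_sketch
```

Each topic contributes one row per term, so within a topic the `beta` values sum to 1 when they are on the probability scale.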

If `matrix == "gamma"`, returns a table with one row per topic and document,
with columns

- topic
Topic, as an integer

- document
Document name (if given in vector of `document_names`) or ID as an integer

- gamma
Probability of topic given document
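The role of `document_names` in the per-document-per-topic output can be sketched the same way in base R. This is an illustration under assumptions, not the package's code: `theta_mat` and the names below are made up, and the fallback to integer IDs mirrors the description of the `document` column above.

```r
# Hypothetical document-topic proportions: 3 documents (rows) by 2 topics (columns).
theta_mat <- matrix(c(0.9, 0.1,
                      0.4, 0.6,
                      0.2, 0.8),
                    nrow = 3, byrow = TRUE)
document_names <- c("Emma", "Persuasion", "Sense & Sensibility")  # or NULL

# Use the supplied names when present, otherwise integer document IDs.
doc_ids <- if (is.null(document_names)) seq_len(nrow(theta_mat)) else document_names

# Column-major order: all documents for topic 1, then all documents for topic 2.
td_gamma_sketch <- data.frame(
  document = rep(doc_ids, times = ncol(theta_mat)),
  topic    = rep(seq_len(ncol(theta_mat)), each = nrow(theta_mat)),
  gamma    = as.vector(theta_mat),
  stringsAsFactors = FALSE
)
td_gamma_sketch
```

Here the `gamma` values for a single document sum to 1, since they describe how that document is divided among the topics.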

If called on an object from `estimateEffect`, returns a table with columns

- topic
Topic, as an integer

- term
The term in the model being estimated and tested

- estimate
The estimated coefficient

- std.error
The standard error from the linear model

- statistic
t-statistic

- p.value
two-sided p-value

##### Examples

```
if (requireNamespace("stm", quietly = TRUE)) {
  library(dplyr)
  library(ggplot2)
  library(tidytext)  # unnest_tokens(), stop_words, cast_sparse(), tidy()
  library(stm)
  library(janeaustenr)

  # build a sparse document-term matrix with one row per book
  austen_sparse <- austen_books() %>%
    unnest_tokens(word, text) %>%
    anti_join(stop_words) %>%
    count(book, word) %>%
    cast_sparse(book, word, n)

  topic_model <- stm(austen_sparse, K = 12, verbose = FALSE, init.type = "Spectral")

  # tidy the word-topic combinations
  td_beta <- tidy(topic_model)
  td_beta

  # Examine the topics
  td_beta %>%
    group_by(topic) %>%
    top_n(10, beta) %>%
    ungroup() %>%
    ggplot(aes(term, beta)) +
    geom_col() +
    facet_wrap(~ topic, scales = "free") +
    coord_flip()

  # tidy the document-topic combinations, with optional document names
  td_gamma <- tidy(topic_model, matrix = "gamma",
                   document_names = rownames(austen_sparse))
  td_gamma

  # using stm's gadarianFit, we can tidy the result of a model
  # estimated with covariates
  effects <- estimateEffect(1:3 ~ treatment, gadarianFit, gadarian)
  td_estimate <- tidy(effects)
  td_estimate
}
```

*Documentation reproduced from package tidytext, version 0.2.2, License: MIT + file LICENSE*