augment.tidylda: Augment method for `tidylda` objects

Description

augment appends observation-level model outputs.

Usage

# S3 method for tidylda
augment(
  x,
  data,
  type = c("class", "prob"),
  document_col = "document",
  term_col = "term",
  ...
)

Value

augment returns a tidy tibble containing one row per document-token pair, with one or more columns appended, depending on the value of type.

If type = 'prob', then one column per topic is appended. Each column holds P(topic | document, token) for that topic.

If type = 'class', then the most-probable topic for each document-token pair is returned. If multiple topics are equally probable, then the topic with the smallest index is returned by default.
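The tie-breaking rule can be illustrated in base R. This is a minimal sketch, not tidylda's internal code: the matrix probs and its values are made up for illustration, and which.max() is used here only because it returns the first index of the maximum, which matches the "smallest index" behavior described above.

```r
# Hypothetical P(topic | document, token) values:
# rows are document-token pairs, columns are topics.
probs <- matrix(
  c(0.6, 0.4,
    0.5, 0.5,   # tie between topic 1 and topic 2
    0.1, 0.9),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("topic_1", "topic_2"))
)

# which.max() returns the first position of the maximum,
# so a tie resolves to the smallest topic index.
topic <- apply(probs, 1, which.max)
topic
```

The second row is a tie, and it is assigned to topic 1 under this rule.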

Arguments

x: an object of class tidylda
data: a tidy tibble containing one row per original document-token pair, such as the output of tdm_tidiers, with at least the columns c("document", "term").
type: either "class" or "prob"
document_col: character specifying the name of the column that corresponds to document IDs. Defaults to "document".
term_col: character specifying the name of the column that corresponds to term/token IDs. Defaults to "term".
...: other arguments passed to methods; currently not used

Details

The key statistic for augment is P(topic | document, token) = P(topic | token) * P(token | document). The values of P(topic | token) are the entries of the 'lambda' matrix in the tidylda object passed as x. P(token | document) is taken to be the frequency of each token normalized within each document.
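The product above can be worked through by hand on a toy example. The sketch below uses base R only and is not the package's implementation. The lambda matrix, its topics-by-tokens orientation, and the example documents are all assumptions made for illustration.

```r
# Hypothetical lambda: P(topic | token), assumed here to be topics x tokens.
# matrix() fills column by column, so each token's column sums to 1.
lambda <- matrix(
  c(0.9, 0.1,   # cat
    0.5, 0.5,   # dog
    0.2, 0.8),  # fish
  nrow = 2,
  dimnames = list(c("topic_1", "topic_2"), c("cat", "dog", "fish"))
)

# Tidy input: one row per original token occurrence.
data <- data.frame(
  document = c("d1", "d1", "d1", "d2", "d2"),
  term     = c("cat", "cat", "dog", "dog", "fish"),
  stringsAsFactors = FALSE
)

# P(token | document): token counts normalized within each document.
counts <- table(data$document, data$term)
p_token_doc <- unclass(prop.table(counts, margin = 1))

# One row per distinct document-token pair.
pairs <- unique(data)
rownames(pairs) <- NULL
p_td <- p_token_doc[cbind(pairs$document, pairs$term)]

# P(topic | token) for each pair's token: rows = pairs, columns = topics.
p_topic_token <- t(lambda[, pairs$term, drop = FALSE])
rownames(p_topic_token) <- NULL

# P(topic | token) * P(token | document), row i scaled by p_td[i].
stat <- p_topic_token * p_td
result <- cbind(pairs, stat)
result
```

For the pair (d1, cat), the token makes up 2 of the 3 tokens in d1, so the topic_1 value is 0.9 * 2/3 = 0.6.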