topicmodels.etm (version 0.1.0)

summary.ETM: Project ETM embeddings using UMAP

Description

Uses the uwot package to map the word embeddings and the centers of the topic embeddings to a 2-dimensional space.

Usage

# S3 method for ETM
summary(object, type = c("umap"), n_components = 2, top_n = 20, ...)

Arguments

object

object of class ETM

type

character string with the type of summary to extract. Defaults to 'umap'; no other summary type is currently implemented.

n_components

the dimension of the space to embed into. Passed on to umap. Defaults to 2.

top_n

the number of most relevant words to retrieve for each topic and place in the 2-dimensional space. Passed on to predict.ETM.

...

further arguments passed on to umap

Value

a list with elements

  • center: a matrix with the embeddings of the topic centers

  • words: a matrix with the embeddings of the words

  • embed_2d: a data.frame containing a 2D representation of the topics and the top_n words associated with each topic, with columns type, term, cluster (the topic number), rank, beta, x, y, weight. Here type is either 'words' or 'centers'; x and y contain the 2D positions of the term; and weight is the emitted beta scaled to the highest beta within a topic, except that a topic center always gets weight 0.8.
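The weight scaling described for embed_2d can be illustrated with a small base R sketch. The data frame below is a made-up stand-in for the word rows of embed_2d (the terms and beta values are invented), and the package's own code may compute the column differently; the sketch only shows what "scaled to the highest beta within a topic" amounts to.

```r
# Toy stand-in for the word rows of embed_2d; all values are invented.
toy <- data.frame(
  type    = "words",
  term    = c("cat", "dog", "fish", "tax", "budget"),
  cluster = c(1, 1, 1, 2, 2),
  beta    = c(0.20, 0.10, 0.05, 0.30, 0.15)
)

# Divide each beta by the largest beta within the same topic (cluster),
# so the top word of every topic gets weight 1.
toy$weight <- toy$beta / ave(toy$beta, toy$cluster, FUN = max)
toy
```

With this scaling the weights are comparable across topics, which is convenient when the weight is later used as a text size or transparency in a plot.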

See Also

umap, ETM

Examples

# NOT RUN {
library(torch)
library(topicmodels.etm)
library(uwot)
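# Load the example ETM model shipped with the package
path     <- system.file(package = "topicmodels.etm", "example", "example_etm.ckpt")
model    <- torch_load(path)
# Arguments other than the model are passed on to umap through ...
overview <- summary(model,
                    metric = "cosine", n_neighbors = 15,
                    fast_sgd = FALSE, n_threads = 1, verbose = TRUE)
# Embeddings of the topic centers
overview$center
# 2D positions of the topic centers and the top words of each topic
overview$embed_2d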
# }
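
One possible way to inspect embed_2d is a quick base R plot. The sketch below does not need a fitted model: it builds a hypothetical data frame using the column names listed under Value (the terms, coordinates, and center labels are invented) and draws words as text and topic centers as points. In practice overview$embed_2d would take the place of the toy data.

```r
# Hypothetical stand-in for overview$embed_2d; all values are invented.
embed_2d <- data.frame(
  type    = c("centers", "words", "words", "centers", "words", "words"),
  term    = c("Cluster 1", "cat", "dog", "Cluster 2", "tax", "budget"),
  cluster = c(1, 1, 1, 2, 2, 2),
  x       = c(0.0, 0.3, -0.2, 4.0, 4.4, 3.7),
  y       = c(0.0, 0.4, 0.1, 3.0, 3.2, 2.6),
  weight  = c(0.8, 1.0, 0.5, 0.8, 1.0, 0.5)
)

# Split the rows by the type column
centers <- subset(embed_2d, type == "centers")
words   <- subset(embed_2d, type == "words")

# Empty canvas spanning all points, then words as text sized by weight
# and topic centers as filled dots, both coloured by topic number
plot(embed_2d$x, embed_2d$y, type = "n", xlab = "x", ylab = "y")
text(words$x, words$y, labels = words$term,
     col = words$cluster, cex = 0.6 + words$weight)
points(centers$x, centers$y, pch = 19, col = centers$cluster)
```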
