Get summary information of a top2vec model, namely the topic centers and the words most similar to each topic.
# S3 method for top2vec
summary(
object,
type = c("similarity", "c-tfidf"),
top_n = 10,
data = object$data,
embedding_words = object$embedding$words,
embedding_docs = object$embedding$docs,
...
)
object: an object of class top2vec as returned by top2vec
type: a character string with the type of summary information to extract for the topwords, either 'similarity' or 'c-tfidf'. 'similarity' extracts the words most similar to each topic based on semantic similarity; 'c-tfidf' extracts the words with the highest tf-idf score for each topic
top_n: integer indicating how many of the most similar words to find for each topic
data: a data.frame with columns `doc_id` and `text` representing documents. For each topic, the function extracts the most similar documents. If type is 'c-tfidf', it gets the words with the highest tf-idf scores for each topic.
embedding_words: a matrix of word embeddings to limit the most similar words to. Defaults to the embedding of words from the object
embedding_docs: a matrix of document embeddings to limit the most similar documents to. Defaults to the embedding of documents from the object
...: not used
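To make the 'similarity' option more concrete, here is a minimal base R sketch of ranking words by cosine similarity to a topic center. It is an illustration only, not the package's implementation: the toy matrices and the helper names `cosine_similarity` and `top_words` are hypothetical, and cosine similarity is assumed as the similarity measure.

```r
# Toy word embeddings: one row per word (made-up values, not from a real model).
embedding_words <- rbind(
  tax     = c(0.9, 0.1, 0.0),
  budget  = c(0.8, 0.2, 0.1),
  vaccine = c(0.1, 0.9, 0.2),
  health  = c(0.0, 0.8, 0.3),
  train   = c(0.1, 0.1, 0.9)
)

# Toy topic centers: one row per topic, same dimensions as the word embeddings.
topic_centers <- rbind(
  topic_1 = c(1.0, 0.1, 0.0),
  topic_2 = c(0.0, 1.0, 0.2)
)

# Hypothetical helper: cosine similarity between every row of `a` and every row of `b`.
cosine_similarity <- function(a, b) {
  a_norm <- a / sqrt(rowSums(a^2))
  b_norm <- b / sqrt(rowSums(b^2))
  a_norm %*% t(b_norm)
}

# Hypothetical helper: names of the `top_n` words closest to each topic center.
top_words <- function(centers, words, top_n = 2) {
  sims <- cosine_similarity(centers, words)
  lapply(
    setNames(rownames(sims), rownames(sims)),
    function(topic) names(sort(sims[topic, ], decreasing = TRUE))[seq_len(top_n)]
  )
}

topwords <- top_words(topic_centers, embedding_words, top_n = 2)
print(topwords)
```

Passing a smaller matrix as `words` restricts the candidate vocabulary, which is the role the `embedding_words` argument is described as playing above.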
# For an example, look at the documentation of ?top2vec
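As a rough illustration of what the 'c-tfidf' type is described as doing, the base R sketch below pools the words of all documents in a topic and scores them with a tf-idf style weight. The toy data and the exact weighting formula are assumptions made for illustration; the package may compute its scores differently.

```r
# Toy documents already assigned to topics (made-up data).
docs <- data.frame(
  topic = c(1, 1, 2, 2),
  text  = c("tax budget tax plan plan",
            "budget deficit tax plan plan",
            "vaccine health plan plan",
            "health care vaccine vaccine plan plan"),
  stringsAsFactors = FALSE
)

# Pool the words of all documents in a topic and count them per topic.
tokens <- strsplit(docs$text, " ", fixed = TRUE)
counts <- unclass(table(
  topic = rep(docs$topic, lengths(tokens)),
  word  = unlist(tokens)
))

# Term frequency within each topic.
tf <- counts / rowSums(counts)

# Assumed weighting: words that occur in fewer topics get a higher weight.
idf <- log(1 + nrow(counts) / colSums(counts > 0))

# Multiply each word column of `tf` by that word's weight.
tfidf <- t(t(tf) * idf)

# Highest-scoring word per topic.
top_tfidf <- apply(tfidf, 1, function(scores) names(which.max(scores)))
print(top_tfidf)
```

In this toy data the word "plan" is the most frequent word in both topics, but because it appears in every topic its weight is lower, so the topic-specific words rank first.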