This function takes a fitted word embedding model and computes the cosine similarity between every pair of words.
similarity_matrix(x, words = NULL, max_terms = 25000)
An N x N matrix of cosine similarity scores between words from a fitted word embedding model.
x: A word embedding matrix.
words: A vector of words, or the name of the column that corresponds to the word dimension of the fitted word embeddings.
max_terms: The maximum number of embedding terms that will be included in the output similarity matrix. Assumes that the embedding input is ordered by word frequency.
# Compute the cosine similarity matrix from the sample embeddings,
# using the "words" column as the word dimension
simmat <- similarity_matrix(wordemb_FasttextEng_sample, words = "words")
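To make the returned N x N matrix more concrete, here is a minimal base R sketch of the kind of computation described above. It is not the package's implementation: the helper name `cosine_similarity_sketch` and the `toy` matrix are made up for illustration, and the sketch assumes a purely numeric matrix with one row per word, rows ordered by word frequency, and no all-zero rows.

```r
# Hypothetical helper (an assumption for illustration, not the package's code):
# cosine similarity between all rows of a numeric embedding matrix.
cosine_similarity_sketch <- function(emb, max_terms = 25000) {
  # Keep only the first max_terms rows (assumes rows are ordered by frequency)
  emb <- emb[seq_len(min(nrow(emb), max_terms)), , drop = FALSE]
  # Scale each row to unit length so that dot products equal cosines
  unit <- emb / sqrt(rowSums(emb^2))
  # N x N matrix of pairwise dot products between the unit-length rows
  tcrossprod(unit)
}

# Toy embeddings: three made-up words in two dimensions
toy <- matrix(c(1, 0,
                0, 2,
                3, 3),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("alpha", "beta", "gamma"), NULL))

sim <- cosine_similarity_sketch(toy)
```

In this toy case `sim` is a 3 x 3 symmetric matrix with ones on the diagonal, and the entry for "alpha" and "beta" is zero because their vectors are orthogonal. Passing `max_terms = 2` would drop "gamma" before any similarities are computed, which is why the ordering assumption matters.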