find_nns: Return nearest neighbors based on cosine similarity
Description
Return the N nearest neighbors of a target embedding, ranked by cosine similarity by default (norm = "l2").
Usage
find_nns(
  target_embedding,
  pre_trained,
  N = 5,
  candidates = NULL,
  norm = "l2",
  stem = FALSE,
  language = "porter"
)
Value
(character) vector of nearest neighbors to target
Arguments
target_embedding
(numeric) 1 x D matrix. D = dimensions of pretrained embeddings.
pre_trained
(numeric) an F x D matrix of pretrained embeddings.
F = number of features and D = embedding dimensions.
rownames(pre_trained) = set of features for which there is a pre-trained embedding.
N
(numeric) number of nearest neighbors to return.
candidates
(character) vector of candidate features for nearest neighbors. Default is NULL.
norm
(character) how to compute similarity (see ?text2vec::sim2):
"l2": cosine similarity
"none": inner product
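The difference between the two norm options can be sketched in base R. This is only an illustration of the two similarity measures, not the package's internal code: the toy matrix, its values, and the helper l2_normalize are made up for the example.

```r
# Toy data (made-up values): 3 features in a 2-dimensional embedding space.
pre_trained <- rbind(
  apple  = c(5, 0),
  banana = c(1, 1),
  cherry = c(1, 3)
)
target_embedding <- matrix(c(1, 1), nrow = 1)

# norm = "none": plain inner product between each feature and the target.
inner_product <- drop(pre_trained %*% t(target_embedding))

# norm = "l2": scale every row to unit length first, then take the inner
# product, which gives the cosine similarity.
# l2_normalize is a hypothetical helper defined only for this sketch.
l2_normalize <- function(m) m / sqrt(rowSums(m^2))
cosine <- drop(l2_normalize(pre_trained) %*% t(l2_normalize(target_embedding)))

# Rank the features from most to least similar under each measure.
rank_inner  <- names(sort(inner_product, decreasing = TRUE))
rank_cosine <- names(sort(cosine, decreasing = TRUE))
```

In this toy example the two rankings are reversed: the inner product rewards long vectors, while cosine similarity depends only on direction.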
stem
(logical) whether to stem candidates when evaluating nearest neighbors. Default is FALSE.
If TRUE, candidate stems are ranked by their average cosine similarity to the target.
We recommend removing misspelled words from the candidate set (candidates), as these can
significantly influence the average.
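The averaging step described for stem = TRUE can be sketched as follows. The similarity values are invented and the stems are written by hand as a stand-in for a real stemmer, so treat this as an illustration of the idea rather than the function's actual implementation.

```r
# Made-up cosine similarities between a target and five candidates.
cos_sim <- c(immigrant = 0.80, immigrants = 0.70, immigration = 0.60,
             border = 0.50, borders = 0.30)

# Hypothetical stems, written by hand here; a stemmer would supply these.
stems <- c("immigr", "immigr", "immigr", "border", "border")

# Average the similarities of all candidates that share a stem,
# then rank the stems by that average.
stem_sim <- tapply(cos_sim, stems, mean)
ranked_stems <- names(sort(stem_sim, decreasing = TRUE))
```

Because each stem's score is a mean, a single candidate with an unusually low or high similarity (for instance a misspelling that maps to the same stem) shifts the score of the whole group.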
language
(character) the name of a recognized language, as returned by
SnowballC::getStemLanguages, or a two- or three-letter ISO-639 code
corresponding to one of these languages (see references for the list of codes).
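A minimal usage sketch, assuming the conText package (which provides find_nns) is installed. The embedding matrix below is a made-up toy, not real pretrained embeddings, and the call is guarded so the snippet is skipped when the package is unavailable.

```r
# Toy "pretrained" embeddings (made-up values): 4 features, 3 dimensions.
pre_trained <- rbind(
  economy = c(0.9, 0.1, 0.0),
  market  = c(0.8, 0.2, 0.1),
  border  = c(0.1, 0.9, 0.2),
  wall    = c(0.0, 0.8, 0.3)
)

# Target embedding: a 1 x D matrix with the same number of columns.
target_embedding <- matrix(c(0.85, 0.15, 0.05), nrow = 1)

# Assumes conText is installed; otherwise the call is skipped.
if (requireNamespace("conText", quietly = TRUE)) {
  nns <- conText::find_nns(
    target_embedding = target_embedding,
    pre_trained = pre_trained,
    N = 2,
    candidates = NULL,
    norm = "l2"
  )
  print(nns)
}
```

The result should be a character vector of at most N feature names taken from rownames(pre_trained).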