get_local_vocab: Identify words common to a collection of texts and a set of pretrained embeddings.
Description
The local vocab is the intersection of the features in the pretrained embeddings
and the words that appear in the collection of texts.
Usage
get_local_vocab(context, pre_trained)
Value
(character) vector of words common to the texts and pretrained embeddings.
Arguments
context
(character) vector of contexts (usually the context column of the get_context() output)
pre_trained
(numeric) an F x D matrix of pretrained embeddings,
where F = number of features and D = embedding dimensions.
rownames(pre_trained) = set of features for which there is a pretrained embedding.
Examples
# find local vocab (use it to define the candidate set of nearest neighbors)
local_vocab <- get_local_vocab(cr_sample_corpus, pre_trained = cr_glove_subset)
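The idea behind the intersection can be illustrated with a small base R sketch. This is not the package's implementation: the toy inputs (toy_contexts, toy_embeddings) are hypothetical, and splitting on whitespace is an assumption that may differ from the tokenization get_local_vocab() actually applies.

```r
# Hypothetical toy inputs, for illustration only
toy_contexts <- c("the senate passed the bill",
                  "the house rejected the bill")

# A 4 x 3 matrix standing in for pretrained embeddings;
# the rownames are the features that have an embedding
toy_embeddings <- matrix(
  seq_len(12) / 10, nrow = 4, ncol = 3,
  dimnames = list(c("the", "bill", "senate", "court"), NULL)
)

# Assumption: split on whitespace (the package's tokenization may differ)
toy_tokens <- unique(unlist(strsplit(toy_contexts, "\\s+")))

# Keep only the tokens that have a row in the embedding matrix
toy_local_vocab <- intersect(toy_tokens, rownames(toy_embeddings))

# Restrict the embeddings to the local vocab, e.g. to use them as the
# candidate set when searching for nearest neighbors
toy_candidates <- toy_embeddings[toy_local_vocab, , drop = FALSE]
```

Words that appear in the texts but have no embedding ("passed", "house", "rejected") are dropped, as is "court", which has an embedding but never appears in the texts.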