Takes a fitted embedding model as input and lets users combine the embeddings of terms that share the same case-folded form, stem, or lemma.
process_embed(
x,
words = NULL,
punct = TRUE,
tolower = TRUE,
lemmatize = TRUE,
stem = FALSE
)
A data frame with the same columns as the input, but with redundant terms combined.
x: A fitted word embedding model in data frame format.
words: The name of the column that corresponds to the word dimension of the fitted word embeddings.
punct: If TRUE, removes punctuation.
tolower: If TRUE, combines terms that differ only by case.
lemmatize: If TRUE, combines terms that share a common lemma. Uses the lexicon package by default.
stem: If TRUE, combines terms that share a common stem. Note: stemming should not be used together with lemmatize, so set lemmatize = FALSE when stem = TRUE.
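
As a rough illustration of what combining terms that differ by case could look like, the base R sketch below lowercases the word column and averages the vectors of terms that collapse to the same string. Averaging is an assumption made for this example; the documentation above does not say how process_embed merges redundant rows. The helper combine_by_case, the toy data, and the column names word, V1, and V2 are all hypothetical.

```r
# Toy embedding in data frame format: one row per term, one column per dimension.
# (Hypothetical data, for illustration only.)
emb <- data.frame(
  word = c("Cat", "cat", "dog", "Dog", "bird"),
  V1 = c(1, 3, 2, 4, 5),
  V2 = c(0, 2, 1, 1, 7),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of the documented function:
# lowercase the word column, then average the vectors of terms
# that become identical after lowercasing (assumed merge rule).
combine_by_case <- function(x, words = "word") {
  x[[words]] <- tolower(x[[words]])
  dims <- setdiff(names(x), words)
  aggregate(x[dims], by = x[words], FUN = mean)
}

combined <- combine_by_case(emb)
print(combined)
```

The result keeps the same columns as the input but has one row per distinct lowercased term, which mirrors the shape of the return value described above.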