process_embed

Takes a fitted embedding model as input and lets users combine the
embeddings of terms that differ only by case or that share a stem or lemma.

keyclust

A fast, computationally efficient algorithm that lets researchers extract semantically related keywords from a fitted embedding model. For details on the method, see Chester (2025), <doi:10.17605/OSF.IO/5B7RQ>.

Patrick Chester

A Model for Semi-Supervised Keyword Extraction from Word
Embedding Models

process_embed function

<dl><dt>x</dt>
<dd>A fitted word embedding model in the data frame format</dd>
<dt>words</dt>
<dd>The name of a column that corresponds to the word dimension of the fitted word embeddings</dd>
<dt>punct</dt>
<dd>Removes punctuation</dd>
<dt>tolower</dt>
<dd>Combines terms that differ by case</dd>
<dt>lemmatize</dt>
<dd>Combines terms that share a common lemma. Uses the lexicon package by default.</dd>
<dt>stem</dt>
<dd>Combines terms that share a common stem. Note: Stemming should not be used in conjunction with lemmatize.</dd></dl>
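
The idea behind the tolower option can be sketched in base R without the package. This is only an illustration under stated assumptions: the toy data frame, the column name "words", and the choice to merge duplicate terms by averaging their vectors are all made up here, since the documentation above does not say which combination rule process_embed uses.

```r
# Toy embedding table: one row per term, one column per dimension.
# The column name "words" and all values are invented for illustration.
emb <- data.frame(
  words = c("Tax", "tax", "taxes", "vote"),
  V1 = c(0.2, 0.4, 0.6, -1.0),
  V2 = c(1.0, 0.0, 0.5, 0.3),
  stringsAsFactors = FALSE
)

# Assumption: terms that collapse to the same lowercase key are merged by
# averaging their vectors. The real function may use a different rule.
key <- tolower(emb$words)
combined <- aggregate(emb[, c("V1", "V2")], by = list(words = key), FUN = mean)

# "Tax" and "tax" now share one row; "taxes" and "vote" are unchanged.
print(combined)

# Hypothetical call to the package function, assuming the flags are logical:
# library(keyclust)
# out <- process_embed(emb, words = "words", punct = FALSE, tolower = TRUE,
#                      lemmatize = FALSE, stem = FALSE)
```

Stemming or lemmatizing would follow the same pattern with a different key: "tax" and "taxes" would also map to one key and be merged, which is why the number of redundant rows drops as more options are switched on.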

process_embed: A tool designed to reduce redundant terms in a fitted embedding model
