sem_search_corpus

Searches a text corpus for specified patterns, with support for parallel processing.

A simple Natural Language Processing (NLP) toolkit focused on search-centric workflows with minimal dependencies. The package offers key features for web scraping, text processing, corpus search, and text embedding generation via the 'HuggingFace API' <https://huggingface.co/docs/api-inference/index>.

Jason Timm

textpress

A Lightweight and Versatile NLP Toolkit

sem_search_corpus function

<dl><dt>tif</dt>
<dd>A data frame or data.table containing the text corpus.</dd>
<dt>text_hierarchy</dt>
<dd>A character vector indicating the column(s) by which to group the data.</dd>
<dt>search</dt>
<dd>The search pattern or query.</dd>
<dt>context_size</dt>
<dd>Numeric, default 0. Specifies the context size, in sentences, around the found patterns.</dd>
<dt>is_inline</dt>
<dd>Logical, default FALSE. Indicates if the search should be inline.</dd>
<dt>highlight</dt>
<dd>A character vector of length two, default c('&lt;b&gt;', '&lt;/b&gt;'), giving the opening and closing strings used to highlight the found patterns in the text.</dd>
<dt>cores</dt>
<dd>Numeric, default 1. The number of cores to use for parallel processing.</dd></dl>
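The arguments above can be illustrated with a small base R sketch. This is not the textpress implementation: `sketch_search` is a hypothetical helper, and the column names `doc_id`, `sentence_id`, and `text` are assumptions about how a sentence-level corpus might be laid out. It only shows how `search`, `context_size`, and `highlight` could plausibly interact, and it ignores `is_inline` and `cores` entirely.

```r
# Hypothetical helper, NOT the textpress implementation.
# Assumes one row per sentence, with rows ordered by sentence within each document.
sketch_search <- function(tif, search, context_size = 0,
                          highlight = c("<b>", "</b>")) {
  # Rows whose text matches the regular expression in `search`
  hits <- which(grepl(search, tif$text, perl = TRUE))

  rows <- lapply(hits, function(i) {
    # Context window measured in sentences, clipped to the same document
    same_doc <- which(tif$doc_id == tif$doc_id[i])
    window <- same_doc[same_doc >= i - context_size &
                         same_doc <= i + context_size]

    # Wrap every match in the hit sentence with the highlight strings
    marked <- gsub(paste0("(", search, ")"),
                   paste0(highlight[1], "\\1", highlight[2]),
                   tif$text[i], perl = TRUE)

    sentences <- tif$text[window]
    sentences[window == i] <- marked

    data.frame(doc_id = tif$doc_id[i],
               sentence_id = tif$sentence_id[i],
               context = paste(sentences, collapse = " "),
               stringsAsFactors = FALSE)
  })

  # Returns NULL when nothing matches
  do.call(rbind, rows)
}

# Toy corpus (made-up sentences, for illustration only)
corpus <- data.frame(
  doc_id = c("1", "1", "1", "2"),
  sentence_id = c(1, 2, 3, 1),
  text = c("The committee met on Monday.",
           "It approved the new budget.",
           "No further action was taken.",
           "A separate budget review is pending."),
  stringsAsFactors = FALSE
)

res <- sketch_search(corpus, search = "budget", context_size = 1)
print(res$context)
```

With `context_size = 1`, each hit is returned together with at most one sentence on either side from the same document, and the matched term is wrapped in the two `highlight` strings. The actual function may differ in its matching rules, its output columns, and how it distributes work across `cores`.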

