nlp_split_sentences: Split Text into Sentences

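Splits the text in a data frame into individual sentences, grouped by the columns given in text_hierarchy. Abbreviations supplied through the abbreviations argument are handled during splitting so that they are not mistaken for sentence boundaries.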

A simple Natural Language Processing (NLP) toolkit focused on search-centric workflows with minimal dependencies. The package offers key features for web scraping, text processing, corpus search, and text embedding generation via the 'HuggingFace API' <https://huggingface.co/docs/api-inference/index>.

Package: textpress (A Lightweight and Versatile NLP Toolkit)

Author: Jason Timm


Arguments


<dl>

<dt>tif</dt>
<dd>A data frame containing text to be split into sentences.</dd>


<dt>text_hierarchy</dt>
<dd>A character vector specifying the columns to group by for sentence splitting, usually 'doc_id'.</dd>


<dt>abbreviations</dt>
<dd>A character vector of abbreviations to handle during sentence splitting, defaults to textpress::abbreviations.</dd>

</dl>
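A minimal usage sketch built only from the arguments documented above. The input column names `doc_id` and `text`, and the sample sentences, are assumptions for illustration. The structure of the returned object is not stated on this page, so the sketch only prints it.

```r
# Hypothetical input: the column names `doc_id` and `text` are assumptions,
# chosen to match the 'doc_id' grouping column mentioned in the arguments.
docs <- data.frame(
  doc_id = c("1", "2"),
  text = c(
    "Dr. Smith arrived at 9 a.m. The meeting started late.",
    "Sentence splitting is a common first step. It feeds later search tasks."
  ),
  stringsAsFactors = FALSE
)

# Run the call only when the package is installed, so the sketch stays runnable.
if (requireNamespace("textpress", quietly = TRUE)) {
  sentences <- textpress::nlp_split_sentences(
    tif = docs,
    text_hierarchy = c("doc_id"),            # group sentences by document
    abbreviations = textpress::abbreviations # documented default, shown explicitly
  )
  print(sentences)
}
```

Passing a custom character vector to `abbreviations` should let domain-specific abbreviations be handled during splitting, in place of the packaged default list.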

nlp_split_sentences: Split Text into Sentences
