textpress

textpress-package

A lightweight NLP toolkit for R organized as a four-stage pipeline: fetch
(URLs from search/Wikipedia), read (content from URLs), process (split,
tokenize, index), and search (regex, BM25, vector similarity, dictionary).
Functions use verb_noun naming for discoverability. Dependencies are
minimal; embeddings are built elsewhere and passed in for semantic search.

internal

A lightweight toolkit for text retrieval and NLP with a consistent and
predictable API organized around four actions: fetching, reading,
processing, and searching. Functions cover the full pipeline from web
data acquisition to text processing and indexing. Multiple search
strategies are supported including regex, BM25 keyword ranking, cosine
similarity, and dictionary matching. Functions are pipe-friendly, have no
heavy dependencies, and return plain data frames. The package is also
useful as a building block for retrieval-augmented generation pipelines
and autonomous agent workflows.
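
As a rough illustration of the cosine-similarity strategy mentioned above, the following base R sketch ranks text chunks against a query embedding and returns a plain data frame. The function name `rank_by_cosine`, its arguments, and the toy two-dimensional embeddings are assumptions made up for this example. They are not part of the textpress API, and the package's own search functions may work differently.

```r
# Hypothetical helper written in base R; NOT a textpress function.
# Ranks rows of an embedding matrix by cosine similarity to a query vector.
rank_by_cosine <- function(query_vec, emb, n = 3) {
  # emb: numeric matrix, one row per text chunk, rownames used as chunk ids
  stopifnot(is.matrix(emb), length(query_vec) == ncol(emb))
  dots  <- as.vector(emb %*% query_vec)                    # dot products
  norms <- sqrt(rowSums(emb^2)) * sqrt(sum(query_vec^2))   # product of L2 norms
  out <- data.frame(
    id = rownames(emb),
    score = dots / norms,
    stringsAsFactors = FALSE
  )
  # Highest similarity first, then keep the top n rows
  out <- out[order(out$score, decreasing = TRUE), , drop = FALSE]
  rownames(out) <- NULL
  head(out, n)
}

# Toy embeddings, invented for the example (real ones are built elsewhere)
emb <- rbind(
  doc1 = c(1, 0),
  doc2 = c(0, 1),
  doc3 = c(1, 1)
)
hits <- rank_by_cosine(c(1, 0), emb, n = 2)
print(hits)
```

Because the result is an ordinary data frame, it can be filtered or joined back to the source text with standard tools.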

Jason Timm

A Lightweight and Versatile NLP Toolkit


Maintainer: Jason Timm <JaTimm@salud.unm.edu> (2026)
