corpus-package: The Corpus Package

Description

Text corpus analysis functions

Details

This package contains functions for text corpus analysis. To create a text object, use the read_ndjson or as_text function. To split text into sentences or token blocks, use the text_split function. To specify preprocessing behavior for transforming a text into a token sequence, use the token_filter function. To tokenize text or compute term frequencies, use the tokens, term_counts, or term_matrix function.
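
The workflow described above can be sketched roughly as follows. This is a minimal illustration, not an authoritative usage guide: it assumes the release of corpus that this page documents, in which the function names as_text, text_split, token_filter, tokens, term_counts, and term_matrix are exported as written here (other releases may use different names). The sample sentences and the map_case = TRUE setting are assumptions chosen for illustration only.

```r
library(corpus)

# Two short illustrative documents (made up for this sketch);
# as_text() converts a character vector into a text object.
x <- as_text(c("The cat sat. The cat slept.", "A dog barked."))

# Split each text into sentences.
sents <- text_split(x, "sentences")

# Preprocessing settings used when turning text into tokens.
# map_case = TRUE (assumed to be available in this release) folds case,
# so "The" and "the" count as the same term.
f <- token_filter(map_case = TRUE)

# Token sequences, one element per input text.
toks <- tokens(x, f)

# Term frequencies over the whole collection.
counts <- term_counts(x, f)

# Text-by-term frequency matrix, one row per input text.
m <- term_matrix(x, f)
```

Passing the same filter object to tokens, term_counts, and term_matrix keeps the preprocessing consistent across all three results.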

For a complete list of functions, use library(help = "corpus").