⚠️ There's a newer version (0.10.2) of this package.

corpus (version 0.6.0)

Text Corpus Analysis

Description

Text corpus data analysis, with full support for Unicode. Functions for reading data from newline-delimited JSON files, for normalizing and tokenizing text, and for computing term occurrence frequencies (including n-grams).
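
As a rough illustration of the workflow described above, the following minimal sketch reads newline-delimited JSON, segments the text, and builds a term frequency matrix. It uses only functions named in this package's index (read_ndjson, text_split, term_matrix) and relies on their default arguments, which are assumed rather than confirmed for version 0.6.0. The sample records, the field names id and text, and the temporary file are hypothetical.

```r
library(corpus)

# Hypothetical sample data: two records, one JSON object per line
path <- tempfile(fileext = ".ndjson")
writeLines(c(
  '{"id": 1, "text": "A rose is a rose. It smells sweet."}',
  '{"id": 2, "text": "The quick brown fox jumps."}'
), path)

# Read the file; each JSON record becomes one row
docs <- read_ndjson(path)

# Segment the text field (default units are assumed to be sentences)
sents <- text_split(docs$text)

# Term frequency matrix: one row per text, one column per term
tm <- term_matrix(docs$text)
```

Inspecting docs, sents, and tm at the console is the simplest way to check what the installed version returns, since argument names and defaults changed between releases.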

Install

install.packages('corpus')

Monthly Downloads

218

Version

0.6.0

License

Apache License (== 2.0) | file LICENSE

Maintainer

Patrick Perry

Last Published

June 6th, 2017

Functions in corpus (0.6.0)

read_ndjson: JSON Data Input
stopwords: Stop Words
term_counts: Term Frequencies
term_matrix: Term Frequency Matrix
text: Text Vectors
text_split: Segmenting Text
abbreviations: Abbreviations
corpus-package: The Corpus Package
tokens: Text Tokenization