A newer version (0.10.2) of this package is available.

corpus (version 0.9.1)

Text Corpus Analysis

Description

Text corpus data analysis, with full support for Unicode. Functions for reading data from newline-delimited JSON files, for normalizing and tokenizing text, for searching for term occurrences, and for computing term occurrence frequencies (including n-grams).
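The description above can be illustrated with a minimal sketch. It uses only functions named in the listing for this version (text_tokens, term_stats, text_locate). The two sample sentences are made up for illustration, and argument names and defaults are assumed from the package documentation and may differ between versions.

```r
# Sketch only: defaults may differ between corpus versions.
library(corpus)

# Two made-up sample sentences (not from any dataset in the package).
docs <- c("The quick brown fox jumps over the lazy dog.",
          "The lazy dog sleeps; the quick fox runs away.")

# Tokenize: returns a list with one character vector of tokens per text.
toks <- text_tokens(docs)

# Tabulate term occurrence frequencies across both texts.
unigrams <- term_stats(docs)

# Same tabulation restricted to two-word sequences (bigrams).
bigrams <- term_stats(docs, ngrams = 2)

# Find each occurrence of a term, along with its surrounding context.
hits <- text_locate(docs, "fox")

print(toks)
print(unigrams)
print(bigrams)
print(hits)
```

The default text filter normalizes the text before tokenizing; text_filter can be used to adjust that behavior if a different normalization is needed.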

Install

install.packages('corpus')

Monthly Downloads

278

Version

0.9.1

License

Apache License (== 2.0) | file LICENSE

Maintainer

Patrick Perry

Last Published

August 20th, 2017

Functions in corpus (0.9.1)

abbreviations: Abbreviations
affect_wordnet: WordNet-Affect Lexicon
as_text: Text Vectors
corpus-deprecated: Deprecated Functions in Package corpus
corpus_frame: Corpus Data Frame
federalist: The Federalist Papers
read_ndjson: JSON Data Input
stopwords: Stop Words
corpus-package: The Corpus Package
corpus: Corpus Objects
text_split: Segmenting Text
text_stats: Text Statistics
text_types: Text Type Sets
utf8: UTF-8 Text Handling
term_matrix: Term Frequency Tabulation
term_stats: Term Stats
text_sub: Text Subsequences
text_tokens: Text Tokenization
text_filter: Text Filters
text_locate: Searching for terms in text