⚠️ There's a newer version (3.1.0) of this package.

cleanNLP (version 2.3.0)

A Tidy Data Model for Natural Language Processing

Description

Provides a set of fast tools for converting a textual corpus into a set of normalized tables. Users may choose the 'udpipe' back end, which has no external dependencies, a Python back end built on 'spaCy', or the Java back end 'CoreNLP'. Exposed annotation tasks include tokenization, part-of-speech tagging, named entity recognition, entity linking, sentiment analysis, dependency parsing, coreference resolution, and word embeddings. Summary statistics on token unigram, part-of-speech tag, and dependency type frequencies are also included to assist with analyses.
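As a rough illustration of the workflow described above, the following sketch annotates two short strings and extracts the token table. It targets the 2.x series only; the newer 3.x API differs. `cnlp_init_tokenizers` appears in the function list below, while `cnlp_annotate`, its `as_strings` and `backend` arguments, and `cnlp_get_token` are assumed from the 2.x interface and should be checked against the reference manual. The example sentences are made up, and depending on the installation this back end may need an additional suggested package.

```r
# Minimal sketch for cleanNLP 2.3.0 (assumed 2.x API; 3.x is different).
library(cleanNLP)

# Initialize the lightweight tokenizers back end (tokenization only).
cnlp_init_tokenizers()

# Two illustrative documents held in memory.
docs <- c("The quick brown fox jumps over the lazy dog.",
          "Tidy tables make text analysis easier.")

# as_strings = TRUE treats the input as raw text rather than file paths.
anno <- cnlp_annotate(docs, as_strings = TRUE, backend = "tokenizers")

# Extract the token table from the annotation object.
tokens <- cnlp_get_token(anno)
head(tokens)
```

Swapping `cnlp_init_tokenizers()` for `cnlp_init_udpipe()` (and the matching `backend` value) should add part-of-speech tags and dependencies, at the cost of needing a udpipe model file.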

Install

install.packages('cleanNLP')

Monthly Downloads

452

Version

2.3.0

License

LGPL-2

Maintainer

Taylor Arnold

Last Published

November 18th, 2018

Functions in cleanNLP (2.3.0)

Run the annotation pipeline on a set of documents

cnlp_get_vector

Access word embedding vector from an annotation object

cnlp_init_corenlp

Interface for initializing the corenlp backend

Write annotation files to disk

cnlp_write_conll

Returns a CoNLL-U Document

cnlp_init_udpipe

Interface for initializing the udpipe backend

Quickly Compute Data Frame of Annotations

Compute Principal Components and store as a Data Frame

cnlp_init_spacy

Interface for initializing the spacy backend

cnlp_utils_tfidf

Construct the TF-IDF Matrix from Annotation or Data Frame

cnlp_read_conll

Reads a CoNLL-U or CoNLL-X File

cnlp_init_tokenizers

Interface for initializing the tokenizers backend

Read annotation files from disk

Universal Part of Speech Code Frequencies

cnlp_get_sentence

Access sentence-level annotations

print.annotation

Print a summary of an annotation object

Renamed functions

Access tokens from an annotation object

Universal Declaration of Human Rights

Universal Dependency Frequencies

Annotation of Barack Obama's State of the Union Addresses

Most frequent English words

cleanNLP-package

cleanNLP: A Tidy Data Model for Natural Language Processing

cnlp_download_udpipe

Download model files needed for udpipe

cnlp_extract_documents

Extract documents from an annotation object

cnlp_get_dependency

Access dependencies from an annotation object

cnlp_get_coreference

Access coreferences from an annotation object

cnlp_get_document

Access document metadata from an annotation object

cnlp_get_entity

Access named entities from an annotation object

cnlp_combine_documents

Combine a set of annotations

cnlp_download_corenlp

Download Java files needed for CoreNLP