Kohei Watanabe

6 packages on CRAN

newsmap

CRAN, 99.99th percentile

Semi-supervised model for geographical document classification (Watanabe 2018) <doi:10.1080/21670811.2017.1293487>. This package currently contains seed dictionaries in English, German, French, Spanish, Russian, Hebrew, Arabic, Japanese and Chinese (Simplified and Traditional).

proxyC

CRAN, 99.99th percentile

Computes proximity between rows or columns of large matrices efficiently in C++. Functions are optimised for large sparse matrices using the Armadillo and Intel TBB libraries. Among several built-in similarity/distance measures, computation of correlation, cosine similarity and Euclidean distance is particularly fast.
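To make the idea of row-wise proximity on sparse data concrete, here is a minimal sketch of pairwise cosine similarity over sparse rows. It is written in plain Python for illustration only and is not proxyC's API or its implementation: the function name `cosine_similarity_rows` and the dict-based row representation are assumptions made for this example.

```python
import math


def cosine_similarity_rows(rows):
    """Pairwise cosine similarity between sparse rows.

    Each row is a dict mapping a column index to a non-zero value
    (a hypothetical representation chosen for this sketch).
    """
    # Euclidean norm of each row, computed from non-zero entries only.
    norms = [math.sqrt(sum(v * v for v in row.values())) for row in rows]
    n = len(rows)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            a, b = rows[i], rows[j]
            # Iterate over the shorter row so the work scales with non-zeros.
            if len(a) > len(b):
                a, b = b, a
            dot = sum(v * b.get(k, 0.0) for k, v in a.items())
            denom = norms[i] * norms[j]
            value = dot / denom if denom else 0.0
            # Cosine similarity is symmetric, so fill both triangles.
            sim[i][j] = sim[j][i] = value
    return sim


rows = [{0: 1.0, 2: 2.0}, {0: 2.0, 2: 4.0}, {1: 3.0}]
sim = cosine_similarity_rows(rows)
```

Here the first two rows point in the same direction, so their similarity is approximately 1, while the third row shares no columns with the others and scores 0 against them.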

geoparser

CRAN, 99.99th percentile

A wrapper for the Geoparser.io API version 0.4.0 (see <https://geoparser.io/>), a web service that identifies places mentioned in text, disambiguates those places, and returns detailed data about them. Basic, limited API access is free, with paid plans available to accommodate larger workloads.

quanteda

CRAN, 99.99th percentile

A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and ngrams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.
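The phrase "matrices of documents by features" can be illustrated with a small sketch that tokenizes documents and counts features per document. This is a simplified, language-neutral illustration in plain Python, not quanteda's API: the function names `tokenize` and `document_feature_matrix` are hypothetical, the matrix here is dense for readability, and splitting on whitespace is an assumption that real tokenizers go well beyond.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def document_feature_matrix(docs):
    """Return (features, matrix), where matrix[i][j] counts feature j in doc i."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    # The feature set is the sorted union of all tokens seen in any document.
    features = sorted(set().union(*counts))
    # A Counter returns 0 for tokens that a document does not contain.
    matrix = [[c[f] for f in features] for c in counts]
    return features, matrix


docs = ["the cat sat", "the dog sat down"]
features, matrix = document_feature_matrix(docs)
```

Each row of `matrix` corresponds to one document and each column to one entry of `features`. Most entries are zero once the vocabulary grows, which is why a sparse representation is the natural choice at scale.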

readtext

CRAN, 99.99th percentile

Functions for importing and handling text files and formatted text files with additional meta-data, including '.csv', '.tab', '.json', '.xml', '.html', '.pdf', '.doc', '.docx', '.rtf', '.xls', '.xlsx', and others.

stopwords

CRAN, 99.99th percentile

Provides multiple sources of stopwords, for use in text analysis and natural language processing.