text2map: R Tools for Text Matrices

text2map is an R package of libraries and utility functions for computational text analysis.

The functions are optimized for working with various kinds of text matrices. The package treats the text matrix as its primary object, represented either as a base R dense matrix or as a sparse matrix from the Matrix package. This focus allows for a consistent and intuitive interface that stays close to the underlying mathematical foundation of computational text analysis. In particular, the package includes functions for working with word embeddings, text networks, and document-term matrices.
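To make the two matrix representations concrete, here is a minimal sketch that builds a tiny document-term matrix (DTM) by hand, first as a base R dense matrix and then as a Matrix package sparse matrix. It uses only base R and Matrix rather than text2map's own builders, and the toy corpus and all object names (`docs`, `dense_dtm`, `sparse_dtm`, `overlap`) are illustrative assumptions, not part of the package.

```r
library(Matrix)

# Toy corpus (illustrative data, not shipped with text2map)
docs <- c(
  d1 = "the cat sat",
  d2 = "the dog sat down",
  d3 = "the cat ran"
)

# Tokenize on single spaces and collect the sorted vocabulary
tokens <- strsplit(docs, " ", fixed = TRUE)
vocab <- sort(unique(unlist(tokens)))

# Dense DTM: count each vocabulary term per document
# (rows are documents, columns are terms)
dense_dtm <- t(vapply(
  tokens,
  function(x) tabulate(match(x, vocab), nbins = length(vocab)),
  integer(length(vocab))
))
colnames(dense_dtm) <- vocab
storage.mode(dense_dtm) <- "double"

# Sparse DTM: the same counts in compressed sparse column form
sparse_dtm <- Matrix(dense_dtm, sparse = TRUE)

# Ordinary matrix algebra applies to either representation,
# e.g. the number of shared term occurrences between documents
overlap <- tcrossprod(sparse_dtm)
```

Because both objects hold the same counts, downstream matrix operations such as the cross-product above give the same numbers either way; the sparse form mainly saves memory when most entries are zero.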

Please check out our book, Mapping Texts: Computational Text Analysis for the Social Sciences.

Installation

Install the CRAN version:

install.packages("text2map")

Or install the latest development version from GitLab:

library(remotes)
install_gitlab("culturalcartography/text2map")

Related Packages

There are four related packages hosted on GitLab: text2map.theme, text2map.corpora, text2map.pretrained, and text2map.dictionaries.

These packages can be installed using the following:

library(remotes)
install_gitlab("culturalcartography/text2map.theme")
install_gitlab("culturalcartography/text2map.corpora")
install_gitlab("culturalcartography/text2map.pretrained")
install_gitlab("culturalcartography/text2map.dictionaries")

Contributions and Support

We welcome contributions!

To contribute, feel free to fork the package repository on GitLab or submit pull requests. We follow the Tidyverse and rOpenSci style guides (see also Advanced R). When adding functions, we encourage any method that works with base R matrices or the Matrix package's dgCMatrix class.
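As a rough illustration of a function that accepts either a base R matrix or a dgCMatrix, the sketch below converts a DTM of counts to within-document relative frequencies. The function name `dtm_relfreq` is hypothetical and not part of text2map; it only shows one way to keep sparse input sparse while handling dense input with plain arithmetic.

```r
library(Matrix)

# Hypothetical helper (not a text2map function): divide each row of a
# DTM by its row total, for dense matrices and dgCMatrix objects alike
dtm_relfreq <- function(dtm) {
  stopifnot(is.matrix(dtm) || inherits(dtm, "dgCMatrix"))

  totals <- Matrix::rowSums(dtm)
  totals[totals == 0] <- 1 # leave empty documents as all zeros

  if (is.matrix(dtm)) {
    # Recycling runs down columns, so row i is divided by totals[i]
    return(dtm / totals)
  }

  # Left-multiplying by a diagonal matrix scales rows without densifying
  out <- Matrix::Diagonal(x = 1 / totals) %*% dtm
  dimnames(out) <- dimnames(dtm)
  out
}

# Usage on a small hand-made DTM, once dense and once sparse
m <- matrix(
  c(1, 0, 3,
    1, 2, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("a", "b"), c("x", "y", "z"))
)
dense_out <- dtm_relfreq(m)
sparse_out <- dtm_relfreq(Matrix(m, sparse = TRUE))
```

Branching on the input class once, at the top of the function, keeps the dense path free of Matrix overhead while the sparse path avoids ever materializing the zeros.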

Please report any issues or bugs here: https://gitlab.com/culturalcartography/text2map/-/issues

Any questions and requests for support can also be directed to the package maintainers (maintainers [at] textmapping [dot] com).

Monthly Downloads: 331

Version: 0.2.0

License: MIT + file LICENSE

Maintainer: Dustin Stoltz

Last Published: April 11th, 2024

Functions in text2map (0.2.0)

get_direction: Word embedding semantic direction extractor
print.CoCA: Prints CoCA class information
seq_builder: Represent Documents as Token-Integer Sequences
rancors_builder: Build Multiple Random Corpora
perm_tester: Monte Carlo Permutation Tests for Model P-Values
plot.CoCA: Plot CoCA
vocab_builder: A fast unigram vocabulary builder
rancor_builder: Build a Random Corpus
jfk_speech: Full Text of JFK's Rice Speech
test_anchors: Evaluate anchor sets in defining semantic directions
meta_shakespeare: Metadata for Shakespeare's First Folio
tiny_gender_tagger: A very tiny "gender" tagger
stoplists: A dataset of stoplists
text2map-package: Text2Map
dtm_resampler: Resamples an input DTM to generate new DTMs
dtm_builder: A fast unigram DTM builder
doc_centrality: Find a specified document centrality metric
ft_wv_sample: Sample of fastText embeddings
CMDist: Calculate Concept Mover's Distance
doc_similarity: Find similarities between documents
Matrix: Import Matrix
get_anchors: Gets anchor terms from precompiled anchor lists
dtm_melter: Melt a DTM into a triplet data frame
CoCA: Performs Concept Class Analysis (CoCA)
find_rejection: Find the 'rejection matrix' from a semantic vector
find_transformation: Find a specified matrix transformation
anchor_lists: A dataset of anchor lists
get_centroid: Word embedding semantic centroid extractor
dtm_stopper: Removes terms from a DTM based on rules
find_projection: Find the 'projection matrix' to a semantic vector
dtm_stats: Gets DTM summary statistics
get_regions: Word embedding semantic region extractor
get_stoplist: Gets stoplist from precompiled lists