textmineR

Maintenance status

⚠️ Note: textmineR is now in maintenance-only mode.
The package remains functional and is kept on CRAN for reproducibility, but it is no longer under active development.

For new work, please see tidylda, which is the actively maintained successor to textmineR. tidylda provides a more modern interface and improved topic modeling functionality while following the tidyverse design philosophy.

textmineR

Functions for Text Mining and Topic Modeling

Copyright 2021 by Thomas W. Jones

An aid for text mining in R, with a syntax familiar to experienced R users. It also implements various functions related to topic modeling, making it a good topic modeling workbench.

textmineR was created with three principles in mind:

  1. Maximize interoperability within R's ecosystem
  2. Scalable in terms of object storage and computation time
  3. Syntax that is idiomatic to R

Please see the vignettes for more information on how to get started.
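
As a rough orientation before reading the vignettes, the sketch below runs a minimal workflow: build a document term matrix, fit an LDA model, and list the top terms per topic. It assumes the bundled `nih_sample` data set with its `ABSTRACT_TEXT` and `APPLICATION_ID` columns. The values of `k`, `iterations`, and `burnin` are arbitrary illustrative choices, not recommendations.

```r
# Quick-start sketch; parameter values are illustrative assumptions.
library(textmineR)

# Sample of NIH grant abstracts bundled with the package
data(nih_sample)

# Build a sparse document term matrix of unigrams
dtm <- CreateDtm(
  doc_vec = nih_sample$ABSTRACT_TEXT,
  doc_names = nih_sample$APPLICATION_ID,
  ngram_window = c(1, 1),
  verbose = FALSE,
  cpus = 1
)

# Fit a small LDA model; k and the iteration counts are arbitrary choices
set.seed(12345)
model <- FitLdaModel(
  dtm = dtm,
  k = 5,
  iterations = 200,
  burnin = 175
)

# phi holds P(term | topic); list the 5 most probable terms per topic
top_terms <- GetTopTerms(phi = model$phi, M = 5)
print(top_terms)
```

The fitted object also carries `theta`, the document-by-topic matrix, which is usually the next thing to inspect.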

Note: there's a lot going on with textmineR at the moment, including adding functionality based on original research.

Install

install.packages('textmineR')

Monthly Downloads: 7,618
Version: 3.0.6
License: MIT + file LICENSE
Maintainer: Thomas W Jones
Last Published: October 18th, 2025

Functions in textmineR (3.0.6)

Internals: Internal helper functions for textmineR
FitLsaModel: Fit a topic model using Latent Semantic Analysis
TermDocFreq: Get term frequencies and document frequencies from a document term matrix.
nih: Abstracts and metadata from NIH research grants awarded in 2014
posterior: Posterior methods for topic models
TmParallelApply
posterior.lda_topic_model: Draw from the posterior of an LDA topic model
predict.ctm_topic_model: Predict method for Correlated topic models (CTM)
textmineR: textmineR
update: Update methods for topic models
textmineR-deprecated: Deprecated functions in package textmineR.
update.lda_topic_model: Update a Latent Dirichlet Allocation topic model with new data
predict.lsa_topic_model: Predict method for LSA topic models
predict.lda_topic_model: Get predictions from a Latent Dirichlet Allocation model
CreateTcm: Convert a character vector to a term co-occurrence matrix.
Cluster2TopicModel: Represent a document clustering as a topic model
Dtm2Docs: Convert a DTM to a Character Vector of documents
CalcHellingerDist: Calculate Hellinger Distance
CreateDtm: Convert a character vector to a document term matrix.
CalcTopicModelR2: Calculate the R-squared of a topic model.
CalcLikelihood: Calculate the log likelihood of a document term matrix given a topic model
CalcProbCoherence: Probabilistic coherence of topics
CalcJSDivergence: Calculate Jensen-Shannon Divergence
CalcGamma: Calculate a matrix whose rows represent P(topic_i|tokens)
LabelTopics: Get some topic labels using a "more probable" method of terms
SummarizeTopics: Summarize topics in a topic model
FitLdaModel: Fit a Latent Dirichlet Allocation topic model
FitCtmModel: Fit a Correlated Topic Model
GetProbableTerms: Get cluster labels using a "more probable" method of terms
Dtm2Lexicon: Turn a document term matrix into a list for LDA Gibbs sampling
Dtm2Tcm: Turn a document term matrix into a term co-occurrence matrix
GetTopTerms: Get Top Terms for each topic from a topic model
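
A few of the utility functions listed above can be tried without fitting a model. The sketch below assumes the bundled `nih_sample_dtm` data set. The probability vectors `p` and `q` are made-up values used only to exercise the distance functions.

```r
# Sketch of some utility functions; p and q are hypothetical inputs.
library(textmineR)

# Pre-built document term matrix bundled with the package
data(nih_sample_dtm)

# Term and document frequencies, most frequent terms first
tf <- TermDocFreq(dtm = nih_sample_dtm)
head(tf[order(-tf$term_freq), ])

# Term co-occurrence matrix derived from the DTM (terms by terms)
tcm <- Dtm2Tcm(dtm = nih_sample_dtm)
dim(tcm)

# Distances between two discrete probability distributions
p <- c(0.7, 0.2, 0.1)
q <- c(0.1, 0.2, 0.7)
h <- CalcHellingerDist(x = p, y = q)
js <- CalcJSDivergence(x = p, y = q)
print(c(hellinger = h, js_divergence = js))
```

Both distance functions also accept a matrix, in which case they compare its rows pairwise. That form is convenient for comparing the rows of a fitted model's `theta` or `phi`.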