tm v0.7-6


Text Mining Package

A framework for text mining applications within R.

Functions in tm

Name                 Description
Source               Sources
DirSource            Directory Source
Docs                 Access Document IDs and Terms
TextDocument         Text Documents
Zipf_n_Heaps         Explore Corpus Term Frequency Characteristics
ZipSource            ZIP File Source
VectorSource         Vector Source
WeightFunction       Weighting Function
Reader               Readers
content_transformer  Content Transformers
SimpleCorpus         Simple Corpora
URISource            Uniform Resource Identifier Source
VCorpus              Volatile Corpora
stemDocument         Stem Words
crude                20 Exemplary News Articles from the Reuters-21578 Data Set of Topic crude
readPlain            Read In a Text Document
findAssocs           Find Associations in a Term-Document Matrix
findFreqTerms        Find Frequent Terms
weightSMART          SMART Weightings
stopwords            Stopwords
readRCV1             Read In a Reuters Corpus Volume 1 Document
weightTf             Weight by Term Frequency
readXML              Read In an XML Document
removeNumbers        Remove Numbers from a Text Document
plot                 Visualize a Term-Document Matrix
weightTfIdf          Weight by Term Frequency - Inverse Document Frequency
readDOC              Read In a MS Word Document
writeCorpus          Write a Corpus to Disk
Corpus               Corpora
removeWords          Remove Words from a Text Document
stemCompletion       Complete Stems
tm_reduce            Combine Transformations
tm_term_score        Compute Score for Matching Terms
acq                  50 Exemplary News Articles from the Reuters-21578 Data Set of Topic acq
tm_combine           Combine Corpora, Documents, Term-Document Matrices, and Term Frequency Vectors
DataframeSource      Data Frame Source
getTokenizers        Tokenizers
hpc                  Parallelized 'lapply'
getTransformations   Transformations
readReut21578XML     Read In a Reuters-21578 XML Document
inspect              Inspect Objects
readDataframe        Read In a Text Document from a Data Frame
readTagged           Read In a POS-Tagged Word Text Document
readPDF              Read In a PDF Document
tokenizer            Tokenizers
removePunctuation    Remove Punctuation Marks from a Text Document
weightBin            Weight Binary
removeSparseTerms    Remove Sparse Terms from a Term-Document Matrix
tm_filter            Filter and Index Functions on Corpora
PCorpus              Permanent Corpora
tm_map               Transformations on Corpora
PlainTextDocument    Plain Text Documents
XMLSource            XML Source
XMLTextDocument      XML Text Documents
findMostFreqTerms    Find Most Frequent Terms
TermDocumentMatrix   Term-Document Matrix
foreign              Read Document-Term Matrices
meta                 Metadata Management
stripWhitespace      Strip Whitespace from a Text Document
termFreq             Term Frequency Vector

Date 2018-12-21
LinkingTo BH, Rcpp
SystemRequirements C++11
License GPL-3
NeedsCompilation yes
Packaged 2018-12-21 13:10:14 UTC; hornik
Repository CRAN
Date/Publication 2018-12-21 13:55:26 UTC
