tm v0.7-1


Text Mining Package

A framework for text mining applications within R.

Functions in tm

Name                 Description
DirSource            Directory Source
Docs                 Access Document IDs and Terms
Corpus               Corpora
DataframeSource      Data Frame Source
Source               Sources
TextDocument         Text Documents
PCorpus              Permanent Corpora
PlainTextDocument    Plain Text Documents
Reader               Readers
SimpleCorpus         Simple Corpora
VectorSource         Vector Source
WeightFunction       Weighting Function
findMostFreqTerms    Find Most Frequent Terms
ZipSource            ZIP File Source
Zipf_n_Heaps         Explore Corpus Term Frequency Characteristics
hpc                  Parallelized ‘lapply’
inspect              Inspect Objects
plot                 Visualize a Term-Document Matrix
readDOC              Read In a MS Word Document
readXML              Read In an XML Document
acq                  50 Exemplary News Articles from the Reuters-21578 Data Set of Topic acq
tm_combine           Combine Corpora, Documents, Term-Document Matrices, and Term Frequency Vectors
content_transformer  Content Transformers
foreign              Read Document-Term Matrices
readPDF              Read In a PDF Document
readPlain            Read In a Text Document
stripWhitespace      Strip Whitespace from a Text Document
XMLSource            XML Source
XMLTextDocument      XML Text Documents
findAssocs           Find Associations in a Term-Document Matrix
findFreqTerms        Find Frequent Terms
TermDocumentMatrix   Term-Document Matrix
meta                 Metadata Management
removePunctuation    Remove Punctuation Marks from a Text Document
termFreq             Term Frequency Vector
tm_filter            Filter and Index Functions on Corpora
tm_map               Transformations on Corpora
removeSparseTerms    Remove Sparse Terms from a Term-Document Matrix
weightTfIdf          Weight by Term Frequency - Inverse Document Frequency
writeCorpus          Write a Corpus to Disk
URISource            Uniform Resource Identifier Source
VCorpus              Volatile Corpora
getTokenizers        Tokenizers
getTransformations   Transformations
readTabular          Read In a Text Document
crude                20 Exemplary News Articles from the Reuters-21578 Data Set of Topic crude
readRCV1             Read In a Reuters Corpus Volume 1 Document
readReut21578XML     Read In a Reuters-21578 XML Document
removeNumbers        Remove Numbers from a Text Document
tokenizer            Tokenizers
weightBin            Weight Binary
removeWords          Remove Words from a Text Document
stemCompletion       Complete Stems
tm_reduce            Combine Transformations
tm_term_score        Compute Score for Matching Terms
readTagged           Read In a POS-Tagged Word Text Document
stemDocument         Stem Words
stopwords            Stopwords
weightSMART          SMART Weightings
weightTf             Weight by Term Frequency

Date: 2017-03-02
LinkingTo: BH, Rcpp
SystemRequirements: Antiword for reading MS Word files, pdfinfo and pdftotext from Poppler for reading PDF
License: GPL-3
NeedsCompilation: yes
Packaged: 2017-03-02 14:36:20 UTC; hornik
Repository: CRAN
Date/Publication: 2017-03-02 17:45:01
