There's a newer version (0.7-14) of this package.

tm (version 0.7-3)

Text Mining Package

Description

A framework for text mining applications within R.

Install

install.packages('tm')

Monthly Downloads

39,688

Version

0.7-3

License

GPL-3

Maintainer

Last Published

December 6th, 2017

Functions in tm (0.7-3)

Docs

Access Document IDs and Terms
SimpleCorpus

Simple Corpora
Corpus

Corpora
TextDocument

Text Documents
DataframeSource

Data Frame Source
Reader

Readers
Source

Sources
PCorpus

Permanent Corpora
URISource

Uniform Resource Identifier Source
PlainTextDocument

Plain Text Documents
DirSource

Directory Source
VCorpus

Volatile Corpora
VectorSource

Vector Source
findAssocs

Find Associations in a Term-Document Matrix
WeightFunction

Weighting Function
findFreqTerms

Find Frequent Terms
findMostFreqTerms

Find Most Frequent Terms
foreign

Read Document-Term Matrices
acq

50 Exemplary News Articles from the Reuters-21578 Data Set of Topic acq
readPlain

Read In a Text Document
tm_combine

Combine Corpora, Documents, Term-Document Matrices, and Term Frequency Vectors
readRCV1

Read In a Reuters Corpus Volume 1 Document
ZipSource

ZIP File Source
content_transformer

Content Transformers
Zipf_n_Heaps

Explore Corpus Term Frequency Characteristics
crude

20 Exemplary News Articles from the Reuters-21578 Data Set of Topic crude
readDataframe

Read In a Text Document from a Data Frame
hpc

Parallelized ‘lapply’
readPDF

Read In a PDF Document
readXML

Read In an XML Document
inspect

Inspect Objects
removeNumbers

Remove Numbers from a Text Document
weightSMART

SMART Weightings
plot

Visualize a Term-Document Matrix
readDOC

Read In a MS Word Document
XMLSource

XML Source
removePunctuation

Remove Punctuation Marks from a Text Document
XMLTextDocument

XML Text Documents
tokenizer

Tokenizers
getTokenizers

Tokenizers
weightBin

Weight Binary
removeWords

Remove Words from a Text Document
getTransformations

Transformations
stemCompletion

Complete Stems
readReut21578XML

Read In a Reuters-21578 XML Document
tm_filter

Filter and Index Functions on Corpora
tm_map

Transformations on Corpora
weightTf

Weight by Term Frequency
removeSparseTerms

Remove Sparse Terms from a Term-Document Matrix
TermDocumentMatrix

Term-Document Matrix
readTagged

Read In a POS-Tagged Word Text Document
meta

Metadata Management
stemDocument

Stem Words
stopwords

Stopwords
termFreq

Term Frequency Vector
tm_reduce

Combine Transformations
weightTfIdf

Weight by Term Frequency - Inverse Document Frequency
tm_term_score

Compute Score for Matching Terms
writeCorpus

Write a Corpus to Disk
stripWhitespace

Strip Whitespace from a Text Document
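
Example

The functions listed above are typically chained: build a corpus from a source, apply transformations with tm_map, then derive a term-document matrix and query it. The following is a minimal sketch of that flow, not an excerpt from the package documentation. The three input sentences are made-up illustration data, and the exact terms kept in the matrix depend on the default tokenizer and word-length settings of the installed tm version.

```r
library(tm)

# Toy documents (hypothetical data, chosen only for illustration).
texts <- c(
  "Crude oil prices rose 3% on Monday.",
  "Oil producers agreed to cut crude output.",
  "The company will acquire its rival for cash."
)

# Build a volatile (in-memory) corpus from a character vector.
corpus <- VCorpus(VectorSource(texts))

# Apply standard clean-up transformations to every document.
# Base functions such as tolower must be wrapped in content_transformer().
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)

# Terms as rows, documents as columns.
tdm <- TermDocumentMatrix(corpus)

# Inspect dimensions and list terms occurring at least twice overall.
nDocs(tdm)
nTerms(tdm)
frequent <- findFreqTerms(tdm, lowfreq = 2)
print(frequent)
```

VCorpus keeps all documents in memory, which suits small experiments like this one; PCorpus is the variant listed above for corpora that should persist outside the R session.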