A newer version (0.7-16) of this package is available.

tm (version 0.5-5)

Text Mining Package

Description

A framework for text mining applications within R.

Install

install.packages('tm')

Monthly Downloads

64,484

Version

0.5-5

License

GPL (>= 2)

Maintainer

Ingo Feinerer

Last Published

February 20th, 2011

Functions in tm (0.5-5)

VCorpus

Volatile Corpus

tm_combine

Combine Corpora, Documents, and Term-Document Matrices

TextDocument

Access and Modify Text Documents

acq

50 Exemplary News Articles from the Reuters-21578 XML Data Set of Topic acq

PCorpus

Permanent Corpus Constructor

WeightFunction

Weighting Function

getSources

List Available Sources

materialize

Materialize Lazy Mappings

prescindMeta

Prescind Document Meta Data

readXML

Read In an XML Document

XMLSource

XML Source

Reuters21578Document

Reuters-21578 Text Document

Source

Access Sources

stemDocument

Stem Words

tm_cluster

Allow 'tm' to Use a Cluster

tm_map

Transformations on Corpora

readDOC

Read In an MS Word Document

findAssocs

Find Associations in a Term-Document Matrix

makeChunks

Split a Corpus into Chunks

searchFullText

Full Text Search

writeCorpus

Write a Corpus to Disk

dissimilarity

Dissimilarity

DataframeSource

Data Frame Source

weightTfIdf

Weight by Term Frequency - Inverse Document Frequency

ReutersSource

Reuters-21578 XML Source

DirSource

Directory Source

PlainTextDocument

Plain Text Document

TermDocumentMatrix

Term-Document Matrix

FunctionGenerator

Function Generator

VectorSource

Vector Source

URISource

Uniform Resource Identifier Source

Zipf_n_Heaps

Explore Corpus Term Frequency Characteristics

GmaneSource

Gmane Source

removeNumbers

Remove Numbers from a Text Document

readPlain

Read In a Text Document

tm_tag_score

Compute a Tag Score

readPDF

Read In a PDF Document

TextRepository

Text Repository

preprocessReut21578XML

Preprocess the Reuters-21578 XML Archive

stemCompletion

Complete Stems

names

Row, Column, Dim Names, Document IDs, and Terms

tm_filter

Filter and Index Functions on Corpora

getFilters

List Available Filters

stopwords

Multilingual Stopwords

inspect

Inspect Objects

readReut21578XML

Read In a Reuters-21578 XML Document

removeWords

Remove Words from a Text Document

stripWhitespace

Strip Whitespace from a Text Document

readTabular

Read In a Text Document

Dictionary

Dictionary

getReaders

List Available Readers

crude

20 Exemplary News Articles from the Reuters-21578 XML Data Set of Topic crude

RCV1Document

RCV1 Text Document

meta

Meta Data Management

plot

Visualize a Term-Document Matrix

tm_reduce

Combine Transformations

number

The Number of Rows/Columns/Dimensions/Documents/Terms of a Term-Document Matrix

readGmane

Read In a Gmane RSS Feed

tm_intersect

Intersection between Documents and Words

removePunctuation

Remove Punctuation Marks from a Text Document

as.PlainTextDocument

Create Objects of Class PlainTextDocument

findFreqTerms

Find Frequent Terms

readRCV1

Read In a Reuters Corpus Volume 1 Document

getTransformations

List Available Transformations

removeSparseTerms

Remove Sparse Terms from a Term-Document Matrix

sFilter

Statement Filter

weightTf

Weight by Term Frequency

termFreq

Term Frequency Vector

weightBin

Weight Binary