tm v0.7-3
Text Mining Package
A framework for text mining applications within R.
Functions in tm
Name | Description
Docs | Access Document IDs and Terms
SimpleCorpus | Simple Corpora
Corpus | Corpora
TextDocument | Text Documents
DataframeSource | Data Frame Source
Reader | Readers
Source | Sources
PCorpus | Permanent Corpora
URISource | Uniform Resource Identifier Source
PlainTextDocument | Plain Text Documents
DirSource | Directory Source
VCorpus | Volatile Corpora
VectorSource | Vector Source
findAssocs | Find Associations in a Term-Document Matrix
WeightFunction | Weighting Function
findFreqTerms | Find Frequent Terms
findMostFreqTerms | Find Most Frequent Terms
foreign | Read Document-Term Matrices
acq | 50 Exemplary News Articles from the Reuters-21578 Data Set of Topic acq
readPlain | Read In a Text Document
tm_combine | Combine Corpora, Documents, Term-Document Matrices, and Term Frequency Vectors
readRCV1 | Read In a Reuters Corpus Volume 1 Document
ZipSource | ZIP File Source
content_transformer | Content Transformers
Zipf_n_Heaps | Explore Corpus Term Frequency Characteristics
crude | 20 Exemplary News Articles from the Reuters-21578 Data Set of Topic crude
readDataframe | Read In a Text Document from a Data Frame
hpc | Parallelized ‘lapply’
readPDF | Read In a PDF Document
readXML | Read In an XML Document
inspect | Inspect Objects
removeNumbers | Remove Numbers from a Text Document
weightSMART | SMART Weightings
plot | Visualize a Term-Document Matrix
readDOC | Read In a MS Word Document
XMLSource | XML Source
removePunctuation | Remove Punctuation Marks from a Text Document
XMLTextDocument | XML Text Documents
tokenizer | Tokenizers
getTokenizers | Tokenizers
weightBin | Weight Binary
removeWords | Remove Words from a Text Document
getTransformations | Transformations
stemCompletion | Complete Stems
readReut21578XML | Read In a Reuters-21578 XML Document
tm_filter | Filter and Index Functions on Corpora
tm_map | Transformations on Corpora
weightTf | Weight by Term Frequency
removeSparseTerms | Remove Sparse Terms from a Term-Document Matrix
TermDocumentMatrix | Term-Document Matrix
readTagged | Read In a POS-Tagged Word Text Document
meta | Metadata Management
stemDocument | Stem Words
stopwords | Stopwords
termFreq | Term Frequency Vector
tm_reduce | Combine Transformations
weightTfIdf | Weight by Term Frequency - Inverse Document Frequency
tm_term_score | Compute Score for Matching Terms
writeCorpus | Write a Corpus to Disk
stripWhitespace | Strip Whitespace from a Text Document
Vignettes of tm
Name
extensions.Rnw
references.bib
tm.Rnw
Details
Date | 2017-12-06 |
LinkingTo | BH, Rcpp |
SystemRequirements | C++11 |
License | GPL-3 |
URL | http://tm.r-forge.r-project.org/ |
Additional_repositories | http://datacube.wu.ac.at |
NeedsCompilation | yes |
Packaged | 2017-12-06 09:38:32 UTC; hornik |
Repository | CRAN |
Date/Publication | 2017-12-06 18:26:44 UTC |
Suggests | antiword, filehash, methods, pdftools, Rcampdf, Rgraphviz, Rpoppler, SnowballC, testthat, tm.lexicon.GeneralInquirer
Imports | graphics, parallel, Rcpp, slam (>= 0.1-37), stats, tools, utils, xml2
Depends | NLP (>= 0.1-6.2), R (>= 3.2.0)
Contributors | Kurt Hornik, Artifex Software, Inc. |