tm (version 0.5-10)

VCorpus: Volatile Corpus

Description

Data structures and operators for volatile corpora.

Usage

Corpus(x, readerControl = list(reader = x$DefaultReader, language = "en"))
VCorpus(x, readerControl = list(reader = x$DefaultReader, language = "en"))
## S3 method for class 'VCorpus':
DMetaData(x)
## S3 method for class 'Corpus':
CMetaData(x)

Arguments

x
A Source object for Corpus and VCorpus, and a corpus for the other functions.
readerControl
A list with the named components reader representing a reading function capable of handling the file format found in x, and language giving the text's language (preferably as IETF langu

Value

  • An object of class VCorpus which extends the classes Corpus and list containing a collection of text documents.
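
As a quick check of the class structure described above, the following minimal sketch builds a corpus from a character vector and inspects it. It assumes VectorSource is available as the source type; the two sample strings and the names docs and corp are made up for illustration.

```r
library(tm)

# Build a small in-memory corpus from a character vector.
# The two strings are made-up sample documents, not data shipped with tm.
docs <- c("Crude oil prices rose.", "OPEC met in Vienna.")
corp <- VCorpus(VectorSource(docs))

# The result carries both the VCorpus and the Corpus class ...
inherits(corp, "VCorpus")
inherits(corp, "Corpus")

# ... and is stored as a list holding one element per document.
is.list(corp)
length(corp)
```

Because the corpus is list-like, individual documents can be pulled out with corp[[1]], corp[[2]], and so on.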

Details

Volatile means that the corpus is kept entirely in memory, so all changes affect only the corresponding R object. In contrast, there is also a corpus implementation providing permanent semantics (see PCorpus).
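
The in-memory behaviour can be illustrated with a small sketch. It assumes tm_map and stripWhitespace as the transformation tools; the sample string and the names corp, cleaned, before, and after are hypothetical. Nothing here is written to disk: the transformation yields a new R object and the original corpus stays as it was.

```r
library(tm)

# Made-up sample text with irregular spacing.
corp <- VCorpus(VectorSource("crude   oil    prices"))

# tm_map returns a transformed copy; `corp` itself is left untouched.
cleaned <- tm_map(corp, stripWhitespace)

# Extract the text of the first document from each corpus for comparison.
before <- as.character(corp[[1]])
after  <- as.character(cleaned[[1]])
```

Comparing before and after shows that only the object returned by tm_map carries the collapsed whitespace.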

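The constructed corpus object inherits from a list and has two attributes containing meta information: CMetaData (corpus-level metadata) and DMetaData (document-level metadata), each available through the accessor function of the same name.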

Examples

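# Directory holding the sample Reuters-21578 XML files shipped with tm
reut21578 <- system.file("texts", "crude", package = "tm")
# Read each file as plain text; the outer parentheses print the corpus
(r <- Corpus(DirSource(reut21578),
             readerControl = list(reader = readReut21578XMLasPlain)))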