tm (version 0.5-10)

PCorpus: Permanent Corpus Constructor

Description

Construct a permanent corpus.

Usage

PCorpus(x,
        readerControl = list(reader = x$DefaultReader, language = "en"),
        dbControl = list(dbName = "", dbType = "DB1"))
DBControl(x)
## S3 method for class 'PCorpus':
DMetaData(x)

Arguments

x
A Source object for PCorpus, and a corpus for the other functions.
readerControl
A list with the named components reader representing a reading function capable of handling the file format found in x, and language giving the text's language (preferably as IETF langu
dbControl
A list with the named components dbName giving the filename holding the sourced out documents (i.e., the database), and dbType holding a valid database type as supported by package filehash. Under activated

Value

  • An object of class PCorpus, which extends the classes Corpus and list, containing a permanent corpus.

Details

Permanent means that documents are physically stored outside of R (e.g., in a database) and R objects are only pointers to external structures. That is, changes in the underlying external representation can affect multiple R objects simultaneously. The constructed corpus object inherits from a list and has three attributes containing meta and database management information.
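
The pointer behaviour described above can be illustrated with package filehash directly, without going through PCorpus. This is only a minimal sketch: the temporary file name, the key "doc1", and the variable names are assumptions made for illustration, and it does not claim to show how tm stores documents internally.

```r
library(filehash)

# Hypothetical database file in a temporary location (illustrative only)
db_file <- tempfile(fileext = ".db")
dbCreate(db_file, type = "DB1")

# Two R objects that refer to the same external database
handle_a <- dbInit(db_file, type = "DB1")
handle_b <- handle_a

# Store a value through one object and read it through the other
dbInsert(handle_a, "doc1", "first version")
before <- dbFetch(handle_b, "doc1")

# Overwrite the value through the first object; the second object sees
# the change because the data live in the database, not in the R object
dbInsert(handle_a, "doc1", "second version")
after <- dbFetch(handle_b, "doc1")

# Remove the temporary database file
unlink(db_file)
```

Copying the handle copies only the reference to the database, which is why a change made through one object is visible through the other.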

Examples

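library(tm)
# Directory of sample plain-text documents shipped with tm
txt <- system.file("texts", "txt", package = "tm")
# Build a permanent corpus backed by the filehash database file "myDB.db"
PCorpus(DirSource(txt),
        dbControl = list(dbName = "myDB.db", dbType = "DB1"))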
