tm (version 0.6-1)

PCorpus: Permanent Corpora

Description

Create permanent corpora.

Usage

PCorpus(x,
        readerControl = list(reader = reader(x), language = "en"),
        dbControl = list(dbName = "", dbType = "DB1"))

Arguments

x
A Source object.
readerControl
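a named list of control parameters for reading in content from x: reader, a function capable of reading in and processing the format delivered by x, and language, a character string giving the language of the documents (default "en").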
dbControl
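a named list of control parameters for the underlying database storage provided by package filehash: dbName, a character string giving the filename of the database, and dbType, a character string giving the database format (default "DB1").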
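
As a hedged illustration of the two control lists, the sketch below passes both explicitly. It assumes that package filehash is installed, uses tm's readPlain reader, assumes the language tag "la" for the sample texts shipped with tm, and writes the database to a temporary file. The variable names are illustrative only.

```r
library(tm)

# Sample plain-text files shipped with tm
txt <- system.file("texts", "txt", package = "tm")

# Illustrative choice: keep the database in a temporary file
db_file <- tempfile(fileext = ".db")

pc <- PCorpus(DirSource(txt),
              # reader and language are set explicitly instead of relying on defaults;
              # "la" is an assumption about the language of the sample texts
              readerControl = list(reader = readPlain, language = "la"),
              # dbName is the database file, dbType the filehash database format
              dbControl = list(dbName = db_file, dbType = "DB1"))

# Per the Value section, the result should inherit from both classes
inherits(pc, "PCorpus")
inherits(pc, "Corpus")
```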

Value

  • An object inheriting from PCorpus and Corpus.

Details

A permanent corpus stores documents outside of R in a database. Since multiple PCorpus R objects with the same underlying database can exist simultaneously in memory, changes in one get propagated to all corresponding objects (in contrast to the default R semantics).
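
The shared-database behaviour described above can be sketched as follows. This is a minimal illustration, not a definitive test of the semantics: it assumes that package filehash is installed, that tm_map applied to a PCorpus writes the transformed documents back to the database, and it uses a temporary database file. The names p1, p2, before, and after are illustrative.

```r
library(tm)

txt <- system.file("texts", "txt", package = "tm")
db_file <- tempfile(fileext = ".db")

p1 <- PCorpus(DirSource(txt),
              dbControl = list(dbName = db_file, dbType = "DB1"))

# A second R object that refers to the same underlying database
p2 <- p1

# Content of the first document as seen through p2, before any change
before <- content(p2[[1]])

# Transform the documents through p1 only
p1 <- tm_map(p1, content_transformer(toupper))

# Read the first document through p2 again
after <- content(p2[[1]])

# With ordinary R copy semantics p2 would be unaffected;
# with a shared database the change made via p1 should be visible here
identical(after, toupper(before))
```

With a VCorpus in place of the PCorpus, p2 would be an independent copy and would be expected to keep the original text.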

See Also

Corpus for basic information on the corpus infrastructure employed by package tm.

VCorpus provides an implementation with volatile storage semantics.

Examples

txt <- system.file("texts", "txt", package = "tm")
PCorpus(DirSource(txt),
        dbControl = list(dbName = "pcorpus.db", dbType = "DB1"))