tm.plugin.dc (version 0.1-7)

Revisions: Revisions in a Distributed Corpus

Description

Each modification of the documents in the corpus creates a new stage, i.e., a new revision of the corpus. To allow fast switching between revisions, all modifications are kept on the file system. The replacement function setRevision() lets you go back to any stage in the history of the corpus.
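The revision idea can be illustrated with a small toy model in base R. This is only a conceptual sketch and not the tm.plugin.dc implementation: it keeps the stages in an in-memory list rather than on the file system, and the helper names (new_store, modify, revisions, activate, current) are hypothetical.

```r
## Toy model (NOT the tm.plugin.dc implementation): every modification
## appends a new stage; switching revisions only moves a pointer.
new_store <- function(docs) {
  list(stages = list(rev1 = docs), active = "rev1")
}

## Apply 'fun' to the active stage and store the result as a new stage.
modify <- function(store, fun) {
  name <- paste0("rev", length(store$stages) + 1L)
  store$stages[[name]] <- lapply(store$stages[[store$active]], fun)
  store$active <- name
  store
}

## Names of all stages, as a list of character strings.
revisions <- function(store) as.list(names(store$stages))

## Mark an existing stage as active; no documents are recomputed.
activate <- function(store, revision) {
  stopifnot(revision %in% names(store$stages))
  store$active <- revision
  store
}

## Documents of the active stage.
current <- function(store) store$stages[[store$active]]

store <- new_store(list("Crude Oil", "OPEC Meeting"))
store <- modify(store, tolower)       # creates stage "rev2"
revs  <- revisions(store)
store <- activate(store, revs[[1]])   # back to the original documents
current(store)
```

Because every stage is retained, going back is a lookup rather than a recomputation. The same reasoning motivates keeping all modifications on the file system in the distributed case.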

Usage

getRevisions( corpus )
setRevision( corpus, revision )

Arguments

corpus
A distributed corpus.
revision
The revision to set as active, i.e., one of the names returned by getRevisions().

Value

  • getRevisions() returns a list of character strings naming all available revisions; setRevision() returns the distributed corpus with the given revision marked as active.

Examples

## load the packages providing the "crude" data set and the distributed corpus
library("tm")
library("tm.plugin.dc")
## provide data on storage
data("crude")
dc <- as.DistributedCorpus(crude)
## do some preprocessing
dc <- tm_map(dc, tolower)
## retrieve available revisions
revs <- getRevisions(dc)
revs
## go back to original revision
setRevision(dc, revs[1])
