tm (version 0.5-10)

tm_map: Transformations on Corpora

Description

Interface to apply transformation functions (also denoted as mappings) to corpora.

Usage

## S3 method for class 'PCorpus':
tm_map(x, FUN, ..., useMeta = FALSE, lazy = FALSE)
## S3 method for class 'VCorpus':
tm_map(x, FUN, ..., useMeta = FALSE, lazy = FALSE)

Arguments

x
A corpus.
FUN
A transformation function returning a text document.
...
Arguments to FUN.
useMeta
Logical. Should DMetaData be passed to FUN as an argument?
lazy
Logical. Lazy mappings are mappings which are delayed until a document's content is accessed. Lazy mapping is useful when working with large corpora of which only a few documents will be accessed, as it avoids the computationally expensive application of the mapping to every document in the corpus.

Value

A corpus with FUN applied to each document in x. For lazy mappings, only annotations are stored; they are evaluated when an individual document is accessed, which triggers execution of the corresponding transformation function.
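The deferred evaluation described above can be illustrated with a toy model in base R. This is only a sketch of the idea, not how tm implements lazy mappings; the helpers `lazy_map` and `access` are hypothetical names introduced for illustration.

```r
# Toy model of a lazy mapping (base R only; NOT tm's implementation).
# lazy_map() and access() are hypothetical helpers for illustration.

# Record the transformation without applying it to any document.
lazy_map <- function(docs, FUN) {
  state <- new.env()
  state$docs <- docs
  state$FUN <- FUN
  state$pending <- rep(TRUE, length(docs))  # one flag per document
  state
}

# Apply the pending transformation only when document i is accessed.
access <- function(state, i) {
  if (state$pending[i]) {
    state$docs[[i]] <- state$FUN(state$docs[[i]])
    state$pending[i] <- FALSE
  }
  state$docs[[i]]
}

docs <- list("Crude Oil", "OPEC Meeting", "Gas Prices")
lazy <- lazy_map(docs, toupper)

# Only the first document is transformed; the others stay untouched.
first <- access(lazy, 1)
```

After the call to `access(lazy, 1)`, the pending flags of the second and third documents are still set, which mirrors why lazy mapping pays off when only a few documents of a large corpus are ever read.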

See Also

getTransformations for available transformations, and materialize for manually triggering the materialization of documents with pending lazy transformations.

Examples

data("crude")
tm_map(crude, stemDocument)
## Generate a custom transformation function which takes the heading
## as new content
headings <- function(x)
    PlainTextDocument(Heading(x), id = ID(x), language = Language(x))
inspect(tm_map(crude, headings))
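A lazy mapping can be requested through the `lazy` argument shown under Usage and later forced with `materialize`, which is mentioned under See Also. The following is a minimal sketch that assumes tm 0.5-10 is installed, that `stripWhitespace` is among the transformations listed by `getTransformations()`, and that `materialize` takes the corpus as its first argument; later tm releases may behave differently.

```r
library(tm)
data("crude")

## Register the transformation without applying it yet
## (assumes the tm 0.5-10 interface documented on this page)
lazy_crude <- tm_map(crude, stripWhitespace, lazy = TRUE)

## Force all pending lazy transformations
## (assumption: materialize() accepts the corpus as first argument)
lazy_crude <- materialize(lazy_crude)
```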