readTabular
From tm v0.6-2
by Ingo Feinerer
Read In a Text Document
Return a function which reads in a text document from a tabular data structure (like a data frame or a list matrix), using knowledge about its internal structure and any available metadata as specified by a so-called mapping.
Usage
readTabular(mapping)
Arguments
- mapping
- A named list of character strings. Each name identifies a target in the text document, and each value names the entry of the tabular structure to read it from. Valid names include content, which maps to the document's content, and any other character string, which is mapped to a metadata entry of that name.
Details
Formally this function is a function generator, i.e., it returns a function (which reads in a text document) with a well-defined signature that can still access the arguments passed to the generator (e.g., the mapping) via lexical scoping.
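The generator pattern can be sketched in base R without tm. In this minimal illustration, make_reader and the plain list it returns are hypothetical stand-ins, not tm's implementation; the point is only that the returned function keeps a fixed signature while still seeing mapping through lexical scoping.

```r
# Hypothetical stand-in for a reader generator; NOT tm's actual code.
make_reader <- function(mapping) {
  stopifnot(is.list(mapping), "content" %in% names(mapping))
  # The returned function has a fixed signature but can still use `mapping`,
  # because R closures capture the environment in which they were created.
  function(elem, language, id) {
    row <- elem$content
    list(content = row[[mapping$content]], language = language, id = id)
  }
}

reader <- make_reader(list(content = "contents"))

# A single tabular row wrapped the way the reader expects it (assumed shape).
elem <- list(content = data.frame(contents = "content 1",
                                  stringsAsFactors = FALSE))
doc <- reader(elem, language = "en", id = "id1")

doc$content                 # the text taken from the mapped column
names(formals(reader))      # the signature of the generated reader
environment(reader)$mapping # the mapping captured by the closure
```

Because the mapping lives in the closure's environment, one generator call can produce several readers with different mappings that all share the same calling convention.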
Value
A function with the following formals:
- elem
- a named list with the component content which must hold the document to be read in.
- language
- a string giving the language.
- id
- a character giving a unique identifier for the created text document.
The function returns a PlainTextDocument representing the text and metadata extracted from elem$content. The arguments language and id are used as fallback if no corresponding metadata entries are found in elem$content.
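The fallback rule for language and id can be illustrated with a small base R sketch. The helper resolve_meta is hypothetical and written only for this illustration; it assumes that "no corresponding metadata entry" means the mapping names no id or language entry, which may differ in detail from what tm does internally.

```r
# Hypothetical helper illustrating the fallback rule; NOT tm's actual code.
resolve_meta <- function(row, mapping, language, id) {
  # Collect every mapped metadata entry (everything except "content").
  keys <- setdiff(names(mapping), "content")
  meta <- lapply(mapping[keys], function(col) row[[col]])
  # Fall back to the arguments only when the mapping supplied no such entry.
  if (is.null(meta[["id"]])) meta[["id"]] <- as.character(id)
  if (is.null(meta[["language"]])) meta[["language"]] <- as.character(language)
  meta
}

row <- data.frame(contents = "content 1", lang = "de",
                  stringsAsFactors = FALSE)

# The mapping provides a language, so the language argument is not used;
# id is not mapped and falls back to the argument.
with_lang <- resolve_meta(row, list(content = "contents", language = "lang"),
                          language = "en", id = "id1")

# Neither id nor language is mapped, so both fall back to the arguments.
without <- resolve_meta(row, list(content = "contents"),
                        language = "en", id = "id1")
```

Here with_lang carries the language taken from the row, while without carries the language passed as an argument.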
See Also
Reader for basic information on the reader infrastructure employed by package tm.
Vignette 'Extensions: How to Handle Custom File Formats'.
Examples
library("tm")

# Tabular input: one row per document
df <- data.frame(contents = c("content 1", "content 2", "content 3"),
                 title    = c("title 1"  , "title 2"  , "title 3"  ),
                 authors  = c("author 1" , "author 2" , "author 3" ),
                 topics   = c("topic 1"  , "topic 2"  , "topic 3"  ),
                 stringsAsFactors = FALSE)

# Map document content and metadata entries to columns of df
m <- list(content = "contents", heading = "title",
          author = "authors", topic = "topics")
myReader <- readTabular(mapping = m)

# Read the first row as a single text document
ds <- DataframeSource(df)
elem <- getElem(stepNext(ds))
(result <- myReader(elem, language = "en", id = "id1"))
meta(result)