getReaders()
getReaders() returns a character vector with the readers provided by package tm.
Readers are functions that extract content from the elements delivered by a
Source and construct a TextDocument. A reader must accept the following
arguments in its signature:
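elem: a named list with the component content, holding the element delivered by the source.
language: a string giving the language.
id: a character giving a unique identifier for the created text document.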
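A minimal sketch of a function with this signature, assuming the three arguments are named elem, language, and id. The name readSimple is hypothetical, and the returned plain list only stands in for a TextDocument; it is not an object created by the package.

```r
# Hypothetical reader: turns one source element into a document-like list.
# The plain list is a stand-in for a TextDocument, not a real tm object.
readSimple <- function(elem, language, id) {
  list(content = elem$content, language = language, id = id)
}

# A source would normally deliver `elem`; here it is built by hand.
elem <- list(content = c("First line.", "Second line."))
doc <- readSimple(elem, language = "en", id = "doc-1")
```

The language and the identifier are passed in explicitly here, which mirrors the case where the element's content carries no information about them.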
The element elem is typically provided by a source whereas the language
and the identifier are normally provided by a corpus constructor (for the case
that elem$content does not give information on these two essential
items). In case a reader expects configuration arguments we can use a function
generator. A function generator is indicated by inheriting from class
FunctionGenerator and function. It allows us to process
additional arguments, store them in an environment, return a reader function
with the well-defined signature described above, and still be able to access
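the additional arguments via lexical scoping. All corpus constructors in
package tm check the reader function for being a function generator and, if so,
apply it to yield the reader with the expected signature.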
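The function generator idea can be sketched in base R, without loading the package. The names readWithPrefix, prefix, and resolveReader are hypothetical, the returned list is only a stand-in for a TextDocument, and resolveReader is an assumption about how a corpus constructor might treat a generator, not the package's actual code.

```r
# Hypothetical generator: `prefix` is an illustrative configuration argument.
readWithPrefix <- function(prefix = "") {
  # `prefix` is stored in this call's environment and stays visible to the
  # reader returned below through lexical scoping.
  function(elem, language, id) {
    # Plain list as a stand-in for a TextDocument (not a real tm object).
    list(content = paste0(prefix, elem$content), language = language, id = id)
  }
}
# Mark the generator by inheriting from FunctionGenerator and function.
class(readWithPrefix) <- c("FunctionGenerator", "function")

# Assumed constructor logic: call a generator to obtain the actual reader,
# and use an ordinary reader function unchanged.
resolveReader <- function(reader) {
  if (inherits(reader, "FunctionGenerator")) reader() else reader
}

elem <- list(content = "hello")

# Passing the generator itself: it is applied with its default arguments.
defaultReader <- resolveReader(readWithPrefix)
docDefault <- defaultReader(elem, language = "en", id = "doc-1")

# Passing a configured reader: the generator was already called by the user.
quotedReader <- resolveReader(readWithPrefix(prefix = "> "))
docQuoted <- quotedReader(elem, language = "en", id = "doc-2")
```

In both cases the function that finally runs has the three-argument signature, while the configuration value is reachable only through the enclosing environment.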
See also: readDOC, readPDF, readPlain, readRCV1, readRCV1asPlain,
readReut21578XML, readReut21578XMLasPlain, readTabular, and readXML.