Import a corpus from a file.
import_corpus(paths, format, language, textcolumn = 1, encoding = NULL)
A Corpus object.
Path to one or more files, or to a directory (if format="txt") to import.
File format: can be "csv", "txt", "factiva", "europresse", "lexisnexis" or "alceste".
The language name or code (preferably as IETF language tags, see language) to be used in particular for stopwords and stemming.
When format="csv", the column containing the text, given either as a column name (string) or as a position.
The character encoding of the file, or NULL to attempt automatic detection.
file <- system.file("texts", "reut21578-factiva.xml", package="tm.plugin.factiva")
import_corpus(file, "factiva", language="en")
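The textcolumn and encoding arguments only matter for some formats, so a CSV import may be a useful complement to the Factiva example. The following is a minimal sketch rather than an official example: the data frame, its column names ("title", "body") and the temporary file are made up for illustration, and the import step is skipped when the R.temis package is not installed.

```r
# Hypothetical two-document data set, written to a temporary CSV file.
# The column names "title" and "body" are illustrative assumptions.
docs <- data.frame(
  title = c("First note", "Second note"),
  body = c("The first short document.", "A second short document."),
  stringsAsFactors = FALSE
)
csv_path <- tempfile(fileext = ".csv")
write.csv(docs, csv_path, row.names = FALSE, fileEncoding = "UTF-8")

# Import only when R.temis is available; textcolumn is given by name here,
# and the encoding is stated explicitly instead of relying on detection.
if (requireNamespace("R.temis", quietly = TRUE)) {
  corpus <- R.temis::import_corpus(csv_path, "csv", language = "en",
                                   textcolumn = "body", encoding = "UTF-8")
  print(corpus)
}
```

Passing textcolumn = 2 should select the same column by position; leaving encoding = NULL asks the function to attempt automatic detection instead.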