Import a corpus from a file.
import_corpus(paths, format, language, textcolumn = 1, encoding = NULL)
Path to one or more files, or to a directory (if format="txt") to import.
File format: can be "csv", "txt", "factiva", "europresse", "lexisnexis" or "alceste".
The language name or code (preferably as IETF language tags, see language) to be used in particular for stopwords and stemming.
When format="csv", the column containing the text, given either as a string or as a position.
The character encoding of the file, or NULL to attempt automatic detection.
A Corpus object.
# NOT RUN {
file <- system.file("texts", "reut21578-factiva.xml", package="tm.plugin.factiva")
import_corpus(file, "factiva", language="en")
# }
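For the "csv" format, the textcolumn argument selects which column holds the document text. The following is a minimal sketch of that case, not an official example. It assumes that import_corpus() is provided by the R.temis package (the package name is not stated above), and it uses a hypothetical two-row data set written to a temporary file; the column names author and text are illustrative only.

```r
# Hypothetical sample data: one metadata column and one text column.
docs <- data.frame(
  author = c("A", "B"),
  text = c("First sample document.", "Second sample document."),
  stringsAsFactors = FALSE
)

# Write the data to a temporary comma-separated file.
csv_path <- tempfile(fileext = ".csv")
write.csv(docs, csv_path, row.names = FALSE, fileEncoding = "UTF-8")

# Assumption: import_corpus() comes from the R.temis package.
# The call is skipped when that package is not installed.
if (requireNamespace("R.temis", quietly = TRUE)) {
  corpus <- R.temis::import_corpus(
    csv_path,
    format = "csv",
    language = "en",
    textcolumn = "text",   # column given by name; a position such as 2 should also work
    encoding = "UTF-8"     # or NULL to attempt automatic detection
  )
  print(corpus)
}
```

Passing the column by name rather than by position keeps the call valid if columns are later reordered in the source file.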