R.temis (version 0.1.3)

import_corpus: import_corpus

Description

Import a corpus from one or more files.

Usage

import_corpus(paths, format, language, textcolumn = 1, encoding = NULL)

Arguments

paths

Path to one or more files, or to a directory (if format="txt"), to import.

format

File format: can be "csv", "txt", "factiva", "europresse", "lexisnexis" or "alceste".

language

The language name or code (preferably an IETF language tag; see language), used in particular for stopwords and stemming.

textcolumn

When format="csv", the column containing the text, given either as a string (the column name) or as a position.

encoding

The character encoding of the file, or NULL to attempt automatic detection.

Value

A Corpus object.

Examples

# NOT RUN {
file <- system.file("texts", "reut21578-factiva.xml", package="tm.plugin.factiva")
import_corpus(file, "factiva", language="en")

# }
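
The textcolumn argument can be illustrated with a small CSV file. The following is only a sketch: the sample data, the column names id and text, and the variable names are invented for illustration, and the call is guarded so that it runs only when R.temis is installed. It assumes that format="csv" accepts a single file with a header row, which the argument descriptions above suggest but do not state explicitly.

```r
# Build a small two-column CSV in a temporary file (invented sample data).
docs <- data.frame(
  id = c("doc1", "doc2"),
  text = c("The first short document.", "A second short document."),
  stringsAsFactors = FALSE
)
csv_path <- tempfile(fileext = ".csv")
write.csv(docs, csv_path, row.names = FALSE)

# Only call import_corpus() when R.temis is available.
if (requireNamespace("R.temis", quietly = TRUE)) {
  # Select the text column by name...
  corpus_by_name <- R.temis::import_corpus(csv_path, "csv", language = "en",
                                           textcolumn = "text")
  # ...or by position (the text is the second column in this file).
  corpus_by_pos <- R.temis::import_corpus(csv_path, "csv", language = "en",
                                          textcolumn = 2)
}
```

Leaving encoding at its default of NULL asks the function to attempt automatic detection, as described under Arguments; pass an explicit encoding name if detection picks the wrong one.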
