stringb (version 0.1.17)

text_read: read in text

Description

A wrapper around readLines() that makes reading text more orderly and convenient. Compared with the wrapped readLines() function, text_read() does some things differently: (1) if no encoding is given, it always assumes files are stored in UTF-8 instead of the system locale; (2) it always converts text to UTF-8 instead of transforming it to the system locale; (3) in addition to loading, it offers to tokenize the text using a regular expression, or NULL for no tokenization at all.
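The three differences listed above can be approximated in base R. The following is only a minimal sketch under stated assumptions, not stringb's actual implementation: the helper name read_utf8_sketch is hypothetical, and it assumes the lines are rejoined with "\n" before the tokenization pattern is applied.

```r
# Hypothetical helper for illustration only; not stringb's implementation.
read_utf8_sketch <- function(file, tokenize = "\n", encoding = "UTF-8", ...) {
  # Read the lines; extra arguments (n, ok, warn, skipNul) go to readLines()
  lines <- readLines(file, ...)
  # (1) + (2): interpret the bytes as `encoding` and convert them to UTF-8
  lines <- iconv(lines, from = encoding, to = "UTF-8")
  # Assumption: rejoin into one string so any pattern can be applied to it
  text <- paste(lines, collapse = "\n")
  # (3): NULL means no tokenization, otherwise split on the regular expression
  if (is.null(tokenize)) {
    return(text)
  }
  strsplit(text, tokenize)[[1]]
}

tmp <- tempfile(fileext = ".txt")
writeLines(c("alpha beta", "gamma"), tmp)

by_line  <- read_utf8_sketch(tmp)                     # split at line breaks
by_space <- read_utf8_sketch(tmp, tokenize = "\\s+")  # split at whitespace
whole    <- read_utf8_sketch(tmp, tokenize = NULL)    # one string, no split
```

Converting with iconv() rather than relying on the locale keeps the result independent of the system the code runs on, which is the point of differences (1) and (2).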

Usage

text_read(file, tokenize = "\n", encoding = "UTF-8", ...)

Arguments

file

name of or path to the file to be read in, or a connection object (see readLines)

tokenize

either NULL, so that no splitting is done; a regular expression used to split the text into parts; or a function that does the splitting (or any other transformation)

encoding

character encoding of the file, passed through to readLines

...

further arguments passed through to readLines, such as n, ok, warn, skipNul
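The three forms accepted by the tokenize argument can be illustrated with a small base R dispatch. This is a hedged sketch only: tokenize_sketch is a hypothetical name, and it assumes that a function passed as tokenize receives the whole text as a single string, which the documentation above does not state.

```r
# Hypothetical helper for illustration only; not part of stringb.
tokenize_sketch <- function(text, tokenize = "\n") {
  # NULL: return the text unchanged, no splitting
  if (is.null(tokenize)) {
    return(text)
  }
  # Function: let it split (or otherwise transform) the text
  if (is.function(tokenize)) {
    return(tokenize(text))
  }
  # Otherwise: treat the value as a regular expression to split on
  strsplit(text, tokenize)[[1]]
}

text <- "one two\nthree"

unsplit <- tokenize_sketch(text, NULL)     # original string
words   <- tokenize_sketch(text, "\\s+")   # split at any whitespace
upper   <- tokenize_sketch(text, toupper)  # arbitrary transformation
```

Checking for NULL and for a function before falling back to strsplit() keeps the regular expression case as the default, which matches the "\n" default shown in the usage line.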