
HadoopStreaming (version 0.2)

hsLineReader: A wrapper for readLines

Description

This function repeatedly reads chunkSize lines of data from file and passes a character vector of these strings to FUN. The first skip lines of input are ignored.
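The behaviour described above can be approximated in base R. The following is a minimal sketch, not the package's source: the name read_in_chunks and its arguments are hypothetical stand-ins, and the loop assumes that an empty result from readLines means the input is exhausted.

```r
# Hypothetical base-R stand-in, not the HadoopStreaming source: read
# `chunk_size` lines at a time from an open connection and pass each
# chunk (a character vector) to `fun`.
read_in_chunks <- function(con, chunk_size = -1L, skip = 0L, fun = print) {
  if (skip > 0L) {
    readLines(con, n = skip)  # discard the first `skip` lines
  }
  repeat {
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0L) break  # nothing left to read
    fun(chunk)
  }
  invisible(NULL)
}

# Collect the chunks in a list to see how the input is split up.
chunks <- list()
con <- textConnection(c("a", "b", "c", "d", "e"), open = "r")
read_in_chunks(con, chunk_size = 2L, skip = 1L,
               fun = function(x) chunks[[length(chunks) + 1L]] <<- x)
close(con)
```

With five input lines, skip = 1 and chunk_size = 2, the callback is invoked twice: once with "b" and "c", then with "d" and "e".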

Usage

hsLineReader(file = "", chunkSize = -1, skip = 0, FUN = function(x) cat(x, sep = "\n"))

Arguments

file
A connection object or a character string, as in readLines.
chunkSize
The (maximal) number of lines to read at a time. The default is -1, which specifies that the whole file should be read at once.
skip
The number of lines to ignore at the beginning of the file.
FUN
A function that takes a character vector as its input.

Value

No return value.
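Because nothing is returned, any result has to be produced by FUN as a side effect, such as printing or updating a variable outside the function. The sketch below illustrates one way to do this; line_count and count_lines are hypothetical names, and the call to hsLineReader is guarded because it assumes the HadoopStreaming package is installed.

```r
line_count <- 0L

# Callback that adds each chunk's length to the running total.
count_lines <- function(x) line_count <<- line_count + length(x)

# Only call the package if it happens to be installed.
if (requireNamespace("HadoopStreaming", quietly = TRUE)) {
  con <- textConnection("one\ntwo\nthree", open = "r")
  HadoopStreaming::hsLineReader(con, chunkSize = 2, FUN = count_lines)
  close(con)
  print(line_count)
}
```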

Details

Warning: A feature(?) of readLines is that if there is a newline before the EOF, an extra empty string is returned.

Examples

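  # Six lines of sample text; the final line has no trailing newline.
  str <- "Hello here are some\nlines of text\nto read in, chunkSize\nlines at a time.\nHow interesting.\nhuh?"
  cat(str)
  # Read the whole input at once (chunkSize = -1) and print it.
  con <- textConnection(str, open = "r")
  hsLineReader(con, chunkSize = -1, FUN = print)
  close(con)
  # Skip the first line, then print the rest in chunks of up to 3 lines.
  con <- textConnection(str, open = "r")
  hsLineReader(con, chunkSize = 3, skip = 1, FUN = print)
  close(con)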
