A convenience function that reads a large file chunk by chunk, processes each chunk (ideally reducing its size by summarizing or by dropping unneeded rows or columns), and returns the results.
Usage
streamingRead(bigFile, n = 1e+06, FUN = function(xx) sub(",.*", "", xx),
..., vocal = FALSE)
Arguments
bigFile
a string giving the path to a file to be read in or a connection opened with "r" mode
n
number of lines to read per chunk
FUN
a function taking the unparsed lines from a chunk of bigFile as a single argument and returning the desired output
...
any additional arguments to FUN
vocal
if TRUE, print a "." as each chunk is processed
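As an illustration of the kind of function that could be supplied as FUN, the sketch below parses a chunk of raw lines and drops rows to reduce its size. The name keepLarge, its threshold parameter, and the two-column header-less CSV layout are assumptions made for this example and are not part of the package.

```r
# Hypothetical chunk processor (not part of the package): parse header-less
# CSV lines and keep only rows whose second column exceeds a threshold.
keepLarge <- function(xx, threshold = 0) {
  # xx is a character vector holding the raw lines of one chunk
  parsed <- read.csv(text = xx, header = FALSE, stringsAsFactors = FALSE)
  parsed[parsed[[2]] > threshold, , drop = FALSE]
}

# Call it directly on a small vector of lines, as one chunk would be passed
chunk <- c("a,1", "b,5", "c,3")
res <- keepLarge(chunk, threshold = 2)
print(res)
```

Based on the argument descriptions above, such a function would presumably be passed as FUN = keepLarge, with threshold = 2 supplied through the ... argument. A file with a header row would need extra handling, since the header appears only in the first chunk.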
Value
a list containing the results of applying FUN to each chunk of the file
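The chunk-by-chunk behaviour described here can be approximated with base R alone. The following is a minimal sketch, assuming the function simply calls readLines repeatedly on an open connection; chunkedRead is a hypothetical stand-in written for illustration and is not the package's actual source.

```r
# Hypothetical stand-in for illustration only; not the package's source.
chunkedRead <- function(bigFile, n = 1e6,
                        FUN = function(xx) sub(",.*", "", xx),
                        ..., vocal = FALSE) {
  # Accept either a file path or an already-open connection
  if (is.character(bigFile)) {
    con <- file(bigFile, "r")
    on.exit(close(con))
  } else {
    con <- bigFile
  }
  out <- list()
  # readLines() returns character(0) once the connection is exhausted
  while (length(chunk <- readLines(con, n = n)) > 0) {
    out[[length(out) + 1]] <- FUN(chunk, ...)
    if (vocal) cat(".")
  }
  if (vocal) cat("\n")
  out
}

# 26 lines read 10 at a time give chunks of 10, 10 and 6 lines
tmp <- tempfile()
writeLines(letters, tmp)
res <- chunkedRead(tmp, n = 10, FUN = length)
unlist(res)  # 10 10 6
```

Because only one chunk of raw lines is held in memory at a time, the total memory needed depends mainly on how much FUN shrinks each chunk.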
Examples
# NOT RUN {
streamingRead(textConnection(LETTERS),10,head,1)
temp<-tempfile()
writeLines(letters,temp)
streamingRead(temp,2,paste,collapse='',vocal=TRUE)
unlist(streamingRead(temp,2,sample,1))
# }