A convenience function that reads a large file chunk by chunk, applies a function to each chunk (ideally reducing its size by summarizing or by dropping unneeded rows or columns), and returns the results.
streamingRead(
bigFile,
n = 1e+06,
FUN = function(xx) sub(",.*", "", xx),
...,
vocal = FALSE
)
Returns a list containing the results of applying FUN to each successive chunk of the file.
bigFile: a string giving the path to a file to be read in, or a connection opened in "r" mode
n: the number of lines to read per chunk
FUN: a function taking the unparsed lines from a chunk of bigFile as its first argument and returning the desired output
...: any additional arguments to FUN
vocal: if TRUE, cat a "." as each chunk is processed
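The chunked reading described by these arguments can be sketched in a few lines of base R. This is only an illustrative re-implementation under stated assumptions, not the package's actual source: the name `streamingReadSketch` is hypothetical, and it assumes that repeated `readLines()` calls on an open connection are enough to walk through the file.

```r
# Hypothetical sketch for illustration; NOT the package's own implementation.
streamingReadSketch <- function(bigFile, n = 1e6,
                                FUN = function(xx) sub(",.*", "", xx),
                                ..., vocal = FALSE) {
  FUN <- match.fun(FUN)
  if (is.character(bigFile)) {
    # Given a path: open the file ourselves and make sure it is closed again
    con <- file(bigFile, open = "r")
    on.exit(close(con), add = TRUE)
  } else {
    # Otherwise assume the caller passed a connection already opened with "r"
    con <- bigFile
  }
  out <- list()
  repeat {
    # Each call continues from where the previous chunk ended
    chunk <- readLines(con, n = n)
    if (length(chunk) == 0) break
    if (vocal) cat(".")
    # Store one result per chunk; only the results are kept in memory
    out <- c(out, list(FUN(chunk, ...)))
  }
  if (vocal) cat("\n")
  out
}

# Usage: 26 lines read 10 at a time gives 3 chunks
tmp <- tempfile()
writeLines(LETTERS, tmp)
res <- streamingReadSketch(tmp, 10, head, 1)
length(res)
```

Because only the output of FUN is retained for each chunk, peak memory use depends on n and on how much FUN shrinks each chunk rather than on the size of the whole file.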
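# Write the 26 uppercase letters to a temporary file, one per line
tmpFile <- tempfile()
writeLines(LETTERS, tmpFile)
# Read 10 lines at a time and keep the first line of each chunk
streamingRead(tmpFile, 10, head, 1)
# Overwrite the file with the lowercase letters
writeLines(letters, tmpFile)
# Read 2 lines at a time and paste each pair together, printing a "." per chunk
streamingRead(tmpFile, 2, paste, collapse = '', vocal = TRUE)
# Pick one random line from each 2-line chunk and flatten the list to a vector
unlist(streamingRead(tmpFile, 2, sample, 1))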
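The default FUN shown in the usage, `function(xx) sub(",.*", "", xx)`, is not exercised by the examples above. As a small illustration of what it does to the lines of one chunk, the snippet below applies the same `sub()` call to a made-up vector of comma-separated lines (the data are invented for this example): everything from the first comma onward is dropped, leaving only the first field.

```r
# Made-up lines, as they might appear in one chunk of a comma-separated file
chunk <- c("id1,10,20", "id2,30,40", "id3,50,60")

# Same expression as the default FUN: delete the first comma and all that follows
firstField <- sub(",.*", "", chunk)
print(firstField)
# [1] "id1" "id2" "id3"
```

Lines that contain no comma are returned unchanged, so with the default FUN a file with a single column passes through as is.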