make.readchunk(input, FUN = identity, chunksize = 5000L)
reset: if this argument is TRUE, it indicates that the data should be reread from the beginning by subsequent calls. When it reads all the data, it automatically resets the file. This function returns the value of FUN applied to the chunk. By default, the chunk is returned as a tbl_df object.
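The reset behaviour described above can be illustrated with a small closure. This is only a sketch under stated assumptions, not the package's implementation: the name make_reader_sketch is hypothetical, it walks over an in-memory data frame instead of a file, and it assumes (following the bigglm convention) that a call with reset = TRUE returns NULL.

```r
# Hypothetical sketch of a chunk reader with reset semantics.
# It iterates over an in-memory data frame rather than a file.
make_reader_sketch <- function(data, FUN = identity, chunksize = 2L) {
  pos <- 0L                          # number of rows already consumed
  function(reset = FALSE) {
    if (reset) {                     # explicit rewind (assumed to return NULL)
      pos <<- 0L
      return(NULL)
    }
    if (pos >= nrow(data)) {         # everything read: rewind and signal the end
      pos <<- 0L
      return(NULL)
    }
    idx <- seq(pos + 1L, min(pos + chunksize, nrow(data)))
    pos <<- max(idx)
    FUN(data[idx, , drop = FALSE])   # value of FUN applied to the chunk
  }
}

reader <- make_reader_sketch(data.frame(x = 1:5), FUN = nrow)
sizes <- integer(0)
while (!is.null(n <- reader())) sizes <- c(sizes, n)
sizes                                # chunk sizes seen in one full pass
```

Because the position is rewound when the data are exhausted, the same while loop can be run a second time without calling the reader with reset = TRUE.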
Chunks are read from input using the fread function. The forms that input may take are described in the help page of fread. The data referenced by input should not have a header. This function is inspired by the bigglm example.
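As an illustration of reading a headerless file piece by piece with fread, the following sketch uses the header, skip and nrows arguments of data.table::fread. The temporary file and the chunk size of 3 are assumptions made for the example, and this is not necessarily how make.readchunk calls fread internally.

```r
library(data.table)

# Write a small file without a header line (as make.readchunk expects).
tmp <- tempfile(fileext = ".csv")
write.table(data.frame(a = 1:6, b = letters[1:6]), file = tmp, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Read the first three rows, then the next three by skipping the first three lines.
first  <- fread(tmp, header = FALSE, nrows = 3)
second <- fread(tmp, header = FALSE, skip = 3, nrows = 3)

names(first)   # with header = FALSE the columns get default names V1, V2, ...
second$V1      # first column of the second chunk
```

Without a header line, every chunk has the same layout, so any offset into the file can be parsed the same way.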
bigglm, fread, tbl_df
## Not run:
# library(hflights)
# nrow(hflights) # Number of rows
#
# ## We create a file with no header
# input <- "hflights.csv"
# write.table(hflights,file=input,sep=",",
# row.names=FALSE,col.names=FALSE)
#
# ## Get the number of rows of each chunk
# readchunk <- make.readchunk(input,FUN=function(x){NROW(x)})
#
# a <- NULL
# while(!is.null(b <- readchunk())) {
# if(is.null(a)) {
# a <- b
# } else {
# a <- a+b
# }
# }
# all.equal(a, nrow(hflights))
#
# ## The file is reset automatically, so the same loop can be run again
# a <- NULL
# while(!is.null(b <- readchunk())) {
# if(is.null(a)) {
# a <- b
# } else {
# a <- a+b
# }
# }
# all.equal(a, nrow(hflights))
# ## End(Not run)