Shards a data.frame/data.table or disk.frame into chunks and saves them as a disk.frame.
`distribute` is an alias for `shard`.
shard(
  df,
  shardby,
  outdir = tempfile(fileext = ".df"),
  ...,
  nchunks = recommend_nchunks(df),
  overwrite = FALSE
)

distribute(...)
df: A data.frame/data.table or disk.frame. If disk.frame, then rechunk(df, ...) is run
shardby: The column(s) to shard the data by.
outdir: The output directory of the disk.frame
...: not used
nchunks: The number of chunks
overwrite: If TRUE then the chunks are overwritten
library(disk.frame)

# shard the iris data.frame by Species so that rows with the same Species are in the same chunk
iris.df = shard(iris, "Species")

# clean up iris.df
delete(iris.df)
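To make the "rows with the same key land in the same chunk" idea concrete, here is a minimal base R sketch of sharding done in memory. It is an illustration only and does not show how disk.frame assigns keys to chunks internally: `shard_in_memory` is a hypothetical helper, it handles a single `shardby` column, and it writes nothing to disk.

```r
# Conceptual sketch only: NOT disk.frame's implementation.
# shard_in_memory() is a hypothetical helper that assigns every row to one of
# `nchunks` groups based on the value in the `shardby` column.
shard_in_memory <- function(df, shardby, nchunks) {
  # Map each distinct key to an integer code, then fold the codes into
  # chunk ids 1..nchunks. Equal keys always get the same chunk id.
  key_code <- as.integer(factor(df[[shardby]]))
  chunk_id <- (key_code - 1L) %% nchunks + 1L
  # split() returns a list of data.frames, one per chunk id in use.
  split(df, chunk_id)
}

chunks <- shard_in_memory(iris, "Species", nchunks = 2)

# Count, for each Species value, how many chunks contain it.
chunks_per_species <- sapply(levels(iris$Species), function(s) {
  sum(sapply(chunks, function(ch) any(ch$Species == s)))
})

# Every Species value appears in exactly one chunk.
stopifnot(all(chunks_per_species == 1))
```

Because a key never spans chunks, per-key operations such as a grouped summary by `Species` can be computed chunk by chunk without combining partial results across chunks.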