There's a newer version (0.3-5) of this package.

High-performance I/O tools for R

Anyone dealing with large data knows that the stock tools in R are bad at loading (non-binary) data into R. This package started as an attempt to provide high-performance parsing tools that minimize copying and avoid the use of strings when possible (see mstrsplit, for example).
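As a rough illustration of parsing without going through intermediate strings, the sketch below feeds raw bytes directly to mstrsplit and dstrsplit. The sample data, the "|" separator and the column types are made up for the example, and the iotools package is assumed to be installed.

```r
library(iotools)

# Three pipe-separated records held as raw bytes, as they would arrive from a file.
input <- charToRaw("1|2|3\n4|5|6\n7|8|9\n")

# Parse the raw vector directly into a numeric matrix: sep is the field
# delimiter and type is the storage mode of the result.
m <- mstrsplit(input, sep = "|", type = "numeric")
print(dim(m))

# dstrsplit handles mixed column types and returns a data frame instead.
rec <- charToRaw("a|1|1.5\nb|2|2.5\n")
d <- dstrsplit(rec, col_types = c("character", "integer", "numeric"), sep = "|")
str(d)
```

Both functions also accept character vectors, so the same calls work on lines that have already been read as strings.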

To allow processing of arbitrarily large files, we have added a way to process input chunk-wise, making it possible to compute on streaming input as well as very large files (see chunk.reader and chunk.apply).
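A minimal sketch of chunk-wise processing with chunk.apply follows. The temporary file, its two-column layout and the deliberately small CH.MAX.SIZE are assumptions chosen so that the example runs quickly and spans several chunks. Real workloads would normally keep the default chunk size.

```r
library(iotools)

# Write a small two-column file to a temporary location (a stand-in for a large file).
path <- tempfile(fileext = ".csv")
writeLines(sprintf("%d,%d", 1:1000, (1:1000) %% 7), path)

# chunk.apply reads the input in chunks of complete lines, calls FUN on each
# raw chunk, and combines the per-chunk results with CH.MERGE.
# The small CH.MAX.SIZE is used here only to force more than one chunk.
sums <- chunk.apply(path, function(chunk) {
  m <- mstrsplit(chunk, sep = ",", type = "numeric")
  colSums(m)                      # per-chunk column totals
}, CH.MERGE = rbind, CH.MAX.SIZE = 2048)

# One row per chunk, so summing the rows gives the totals for the whole file.
total <- colSums(sums)
print(total)

unlink(path)
```

Only one chunk is held in memory at a time, so the same pattern applies to files that are larger than available RAM, provided the per-chunk results are small enough to merge.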

The next natural step was to add support for Hadoop streaming. The major goal was to make it possible to compute using Hadoop MapReduce by writing code that is very natural, much like using lapply on data chunks, without needing to know anything about Hadoop. See the wiki page for the idea and the hmr function for the documentation.

Install

install.packages('iotools')

Monthly Downloads

1,564

Version

0.2-5

License

GPL-2 | GPL-3

Maintainer

Simon Urbanek

Last Published

January 25th, 2018

Functions in iotools (0.2-5)

chunk.apply: Process input by applying a function to each chunk
chunk.map: Map a function over a file by chunks
dstrfw: Split fixed width input into a data frame
dstrsplit: Split binary or character input into a data frame
as.output: Character Output
chunk: Functions for very fast chunk-wise processing
fdrbind: Fast row-binding of lists and data frames
idstrsplit: Create an iterator for splitting binary or character input into a data frame
ctapply: Fast tapply() replacement functions
.default.formatter: Default formatter, corresponding to the as.output functions
write.csv.raw: Fast data output to disk
output.file: Write an R object to a file as a character string
read.csv.raw: Fast data frame input
mstrsplit: Split binary or character input into a matrix
line.merge: Merge multiple sources
imstrsplit: Create an iterator for splitting binary or character input into a matrix
input.file: Load a file on the disk
readAsRaw: Read binary data in as raw
which.min.key: Determine the next key in bytewise order