Note: a newer version (0.8.6.1) of this package is available.

datadr: Divide and Recombine in R

datadr is an R package that leverages RHIPE to provide a simple interface to division and recombination (D&R) methods for large complex data.
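
The divide, apply, and recombine pattern behind D&R can be illustrated with a minimal base-R sketch. It does not use datadr or RHIPE: `split`, `lapply`, and `do.call(rbind, ...)` stand in for the division, per-subset transformation, and recombination steps on the built-in `iris` data. The choice of statistic (mean sepal length per species) and the variable names are purely illustrative assumptions.

```r
# Conceptual D&R sketch in base R (no datadr calls; names are illustrative).

# Divide: split the built-in iris data frame into one subset per species.
subsets <- split(iris, iris$Species)

# Apply: compute the mean sepal length within each subset independently.
per_subset <- lapply(subsets, function(x) {
  data.frame(
    Species = as.character(x$Species[1]),
    meanSepalLength = mean(x$Sepal.Length)
  )
})

# Recombine: row-bind the per-subset results into a single data frame.
result <- do.call(rbind, per_subset)
print(result)
```

In datadr itself, these roles are played by functions such as `divide`, `addTransform`, and `recombine` from the function index; consult the package documentation for their exact arguments.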

To get started, see the package documentation and function reference located here.

Visualization tools based on D&R can be found here.

Installation

options(repos = c(tessera = "http://packages.tessera.io", getOption("repos")))
install.packages("datadr")

Alternatively, you can install directly from GitHub:

devtools::install_github("tesseradata/datadr")

License

This software is currently released under the BSD 3-clause license. Please read the license document.

Acknowledgement

datadr development is sponsored by:

  • U.S. Department of Defense Advanced Research Projects Agency, XDATA program
  • U.S. Department of Homeland Security, Science and Technology Directorate, Homeland Security Advanced Research Projects Agency (HSARPA)
  • Pacific Northwest National Laboratory, operated by Battelle for the U.S. Department of Energy, LDRD Program, Signature Discovery and Future Power Grid Initiatives

Version: 0.8.5
License: BSD_3_clause + file LICENSE
Maintainer: Ryan Hafen
Last Published: March 16th, 2016
Monthly Downloads: 11

Functions in datadr (0.8.5)

getCondCuts: Get names of the conditioning variable cuts
combDdf: "DDF" Recombination
addData: Add Key-Value Pairs to a Data Connection
drRead.table: Data Input
drPersist: Persist a Transformed 'ddo' or 'ddf' Object
drQuantile: Sample Quantiles for 'ddf' Objects
adult: "Census Income" Dataset
divide: Divide a Distributed Data Object
as.data.frame.ddf: Turn 'ddf' Object into Data Frame
bsv: Construct Between Subset Variable (BSV)
digestFileHash: Digest File Hash Function
flatten: "Flatten" a ddf Subset
ddo-ddf-accessors: Accessor Functions
combRbind: "rbind" Recombination
drSample: Take a Sample of Key-Value Pairs
drGetGlobals: Get Global Variables and Package Dependencies
drAggregate: Division-Agnostic Aggregation
print.kvPair: Print a key-value pair
mr-summary-stats: Functions to Compute Summary Statistics in MapReduce
drFilter: Filter a 'ddo' or 'ddf' Object
removeData: Remove Key-Value Pairs from a Data Connection
hdfsConn: Connect to Data Source on HDFS
datadr-package: datadr
combMeanCoef: Mean Coefficient Recombination
drJoin: Join Data Sources by Key
charFileHash: Character File Hash Function
condDiv: Conditioning Variable Division
drBLB: Bag of Little Bootstraps Transformation Method
ddf: Instantiate a Distributed Data Frame ('ddf')
drLapply: Apply a function to all key-value pairs of a ddo/ddf object
getSplitVar: Extract "Split" Variable(s)
mrExec: Execute a MapReduce Job
drSubset: Subsetting Distributed Data Frames
convert: Convert 'ddo' / 'ddf' Objects
localDiskConn: Connect to Data Source on Local Disk
combDdo: "DDO" Recombination
applyTransform: Apply transformation function(s)
ddo: Instantiate a Distributed Data Object ('ddo')
drLM: LM Transformation Method
ddf-accessors: Accessor methods for 'ddf' objects
%>%: Pipe data
rhipeControl: Specify Control Parameters for RHIPE Job
readTextFileByChunk: Experimental sequential text reader helper function
to_ddf: Convert dplyr grouped_df to ddf
addTransform: Add a Transformation Function to a Distributed Data Object
print.ddo: Print a "ddo" or "ddf" Object
ddo-ddf-attributes: Managing attributes of 'ddo' or 'ddf' objects
divide-internals: Functions used in divide()
readHDFStextFile: Experimental HDFS text reader helper function
rrDiv: Random Replicate Division
drHexbin: HexBin Aggregation for Distributed Data Frames
combCollect: "Collect" Recombination
makeExtractable: Take a ddo/ddf HDFS data object and turn it into a mapfile
updateAttributes: Update Attributes of a 'ddo' or 'ddf' Object
kvApply: Apply Function to Key-Value Pair
drGLM: GLM Transformation Method
kvPairs: Specify a Collection of Key-Value Pairs
combMean: Mean Recombination
localDiskControl: Specify Control Parameters for MapReduce on a Local Disk Connection
setupTransformEnv: Set up transformation environment
print.kvValue: Print value of a key-value pair
as.list.ddo: Turn 'ddo' / 'ddf' Object into a list
kvPair: Specify a Key-Value Pair
recombine: Recombine