by Ryan Hafen

Divide and Recombine for Large, Complex Data

Methods for dividing data into subsets, applying analytical methods to the subsets, and recombining the results. A generic MapReduce interface is also provided. Works with key-value pairs stored in memory, on local disk, or on HDFS; the HDFS backend uses the R and Hadoop Integrated Programming Environment (RHIPE).
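
The divide, apply, and recombine steps described above can be illustrated with a minimal, language-neutral sketch. The Python below is not the datadr API: the helper names (`divide`, `apply_transform`, `recombine_collect`, `recombine_mean`) and the toy records are assumptions made for illustration only. It only shows the general pattern of keying subsets, transforming each subset independently, and then combining the per-subset results.

```python
from collections import defaultdict
from statistics import mean


def divide(records, by):
    """Split a list of dict records into key-value pairs keyed by the field `by`."""
    subsets = defaultdict(list)
    for rec in records:
        subsets[rec[by]].append(rec)
    return dict(subsets)


def apply_transform(subsets, fn):
    """Apply `fn` to each subset independently, keeping the subset keys."""
    return {key: fn(rows) for key, rows in subsets.items()}


def recombine_collect(results):
    """Recombine by collecting the per-subset results into a sorted list of pairs."""
    return sorted(results.items())


def recombine_mean(results):
    """Recombine by averaging the per-subset results."""
    return mean(list(results.values()))


# Hypothetical toy data, not taken from the package.
records = [
    {"species": "setosa", "sepal_length": 5.0},
    {"species": "setosa", "sepal_length": 5.5},
    {"species": "virginica", "sepal_length": 6.0},
    {"species": "virginica", "sepal_length": 7.0},
]

by_species = divide(records, "species")
per_subset = apply_transform(
    by_species, lambda rows: mean([r["sepal_length"] for r in rows])
)
collected = recombine_collect(per_subset)
overall = recombine_mean(per_subset)

print(collected)
print(overall)
```

In this sketch each subset is processed without reference to the others, which is what would let the same pattern be distributed across local disk or HDFS backends. The two recombination helpers are only loosely analogous in spirit to the collect-style and mean-style recombinations listed in the function table below.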

Functions in datadr

Name Description
ddf-accessors Accessor methods for 'ddf' objects
localDiskConn Connect to Data Source on Local Disk
getSplitVar Extract "Split" Variable(s)
addData Add Key-Value Pairs to a Data Connection
drQuantile Sample Quantiles for 'ddf' Objects
combCollect "Collect" Recombination
drAggregate Division-Agnostic Aggregation
combRbind "rbind" Recombination
readHDFStextFile Experimental HDFS text reader helper function
drFilter Filter a 'ddo' or 'ddf' Object
removeData Remove Key-Value Pairs from a Data Connection
print.ddo Print a "ddo" or "ddf" Object
divide Divide a Distributed Data Object
addTransform Add a Transformation Function to a Distributed Data Object
ddo-ddf-accessors Accessor Functions
divide-internals Functions used in divide()
setupTransformEnv Set up transformation environment
makeExtractable Take a ddo/ddf HDFS data object and turn it into a mapfile
mrExec Execute a MapReduce Job
hdfsConn Connect to Data Source on HDFS
ddo-ddf-attributes Managing attributes of 'ddo' or 'ddf' objects
drGetGlobals Get Global Variables and Package Dependencies
adult "Census Income" Dataset
bsv Construct Between Subset Variable (BSV)
ddf Instantiate a Distributed Data Frame ('ddf')
datadr-package datadr
recombine Recombine
drLapply Apply a function to all key-value pairs of a ddo/ddf object
drSubset Subsetting Distributed Data Frames
condDiv Conditioning Variable Division
applyTransform Apply transformation function(s)
combDdo "DDO" Recombination
combMeanCoef Mean Coefficient Recombination
updateAttributes Update Attributes of a 'ddo' or 'ddf' Object
print.kvValue Print value of a key-value pair
as.data.frame.ddf Turn 'ddf' Object into Data Frame
combDdf "DDF" Recombination
print.kvPair Print a key-value pair
drJoin Join Data Sources by Key
getCondCuts Get names of the conditioning variable cuts
charFileHash Character File Hash Function
kvPair Specify a Key-Value Pair
drGLM GLM Transformation Method
ddo Instantiate a Distributed Data Object ('ddo')
drSample Take a Sample of Key-Value Pairs
drRead.table Data Input
mr-summary-stats Functions to Compute Summary Statistics in MapReduce
drLM LM Transformation Method
kvApply Apply Function to Key-Value Pair
drBLB Bag of Little Bootstraps Transformation Method
kvPairs Specify a Collection of Key-Value Pairs
drPersist Persist a Transformed 'ddo' or 'ddf' Object
localDiskControl Specify Control Parameters for MapReduce on a Local Disk Connection
readTextFileByChunk Experimental sequential text reader helper function
rrDiv Random Replicate Division
%>% Pipe data
to_ddf Convert dplyr grouped_df to ddf
rhipeControl Specify Control Parameters for RHIPE Job
as.list.ddo Turn 'ddo' / 'ddf' Object into a list
convert Convert 'ddo' / 'ddf' Objects
combMean Mean Recombination
digestFileHash Digest File Hash Function
flatten "Flatten" a ddf Subset
drHexbin HexBin Aggregation for Distributed Data Frames
Type Package
Date 2016-03-14
License BSD_3_clause + file LICENSE
LazyLoad yes
LazyData yes
NeedsCompilation no
RoxygenNote 5.0.1
Packaged 2016-03-14 18:58:23 UTC; hafen
Repository CRAN
Date/Publication 2016-03-14 23:55:14

