datadr v0.8.4

by Ryan Hafen

Divide and Recombine for Large, Complex Data

Methods for dividing data into subsets, applying analytical methods to the subsets, and recombining the results. The package also provides a generic MapReduce interface. It works with key-value pairs stored in memory, on local disk, or on HDFS; HDFS access uses the R and Hadoop Integrated Programming Environment (RHIPE).
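
The divide, apply, and recombine steps described above can be illustrated with a small language-neutral sketch. This is plain Python over in-memory records, not datadr's R API. The helper names (`divide`, `apply_to_subsets`, `recombine`) and the sample rows are assumptions made for illustration only; they loosely mirror the roles of the package's `divide`, `addTransform`, and `recombine` functions.

```python
from collections import defaultdict
from statistics import mean


def divide(rows, by):
    """Split rows into key-value pairs: key is the value of field `by`,
    value is the list of rows sharing that key (hypothetical helper)."""
    subsets = defaultdict(list)
    for row in rows:
        subsets[row[by]].append(row)
    return dict(subsets)


def apply_to_subsets(subsets, fn):
    """Apply an analytic function independently to each subset."""
    return {key: fn(subset) for key, subset in subsets.items()}


def recombine(results):
    """Collect per-subset results into one list of records, ordered by key."""
    return [{"key": key, "value": value} for key, value in sorted(results.items())]


# Made-up sample data, for illustration only.
rows = [
    {"group": "a", "length": 1.0},
    {"group": "b", "length": 4.0},
    {"group": "a", "length": 3.0},
    {"group": "b", "length": 6.0},
]

subsets = divide(rows, by="group")
means = apply_to_subsets(subsets, lambda s: mean(r["length"] for r in s))
combined = recombine(means)
print(combined)
```

Because each subset is processed independently, the apply step can run in parallel or on separate machines, which is what makes the pattern suit storage backends such as local disk or HDFS.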

Functions in datadr

Name Description
ddf-accessors Accessor methods for 'ddf' objects
localDiskConn Connect to Data Source on Local Disk
getSplitVar Extract "Split" Variable(s)
addData Add Key-Value Pairs to a Data Connection
drQuantile Sample Quantiles for 'ddf' Objects
combCollect "Collect" Recombination
drAggregate Division-Agnostic Aggregation
combRbind "rbind" Recombination
readHDFStextFile Experimental HDFS text reader helper function
drFilter Filter a 'ddo' or 'ddf' Object
removeData Remove Key-Value Pairs from a Data Connection
print.ddo Print a "ddo" or "ddf" Object
divide Divide a Distributed Data Object
addTransform Add a Transformation Function to a Distributed Data Object
ddo-ddf-accessors Accessor Functions
divide-internals Functions used in divide()
setupTransformEnv Set up transformation environment
makeExtractable Take a ddo/ddf HDFS data object and turn it into a mapfile
mrExec Execute a MapReduce Job
hdfsConn Connect to Data Source on HDFS
ddo-ddf-attributes Managing attributes of 'ddo' or 'ddf' objects
drGetGlobals Get Global Variables and Package Dependencies
adult "Census Income" Dataset
bsv Construct Between Subset Variable (BSV)
ddf Instantiate a Distributed Data Frame ('ddf')
datadr-package datadr
recombine Recombine
drLapply Apply a function to all key-value pairs of a ddo/ddf object
drSubset Subsetting Distributed Data Frames
condDiv Conditioning Variable Division
applyTransform Apply transformation function(s)
combDdo "DDO" Recombination
combMeanCoef Mean Coefficient Recombination
updateAttributes Update Attributes of a 'ddo' or 'ddf' Object
print.kvValue Print value of a key-value pair
as.data.frame.ddf Turn 'ddf' Object into Data Frame
combDdf "DDF" Recombination
print.kvPair Print a key-value pair
drJoin Join Data Sources by Key
getCondCuts Get names of the conditioning variable cuts
charFileHash Character File Hash Function
kvPair Specify a Key-Value Pair
drGLM GLM Transformation Method
ddo Instantiate a Distributed Data Object ('ddo')
drSample Take a Sample of Key-Value Pairs
drRead.table Data Input
mr-summary-stats Functions to Compute Summary Statistics in MapReduce
drLM LM Transformation Method
kvApply Apply Function to Key-Value Pair
drBLB Bag of Little Bootstraps Transformation Method
kvPairs Specify a Collection of Key-Value Pairs
drPersist Persist a Transformed 'ddo' or 'ddf' Object
localDiskControl Specify Control Parameters for MapReduce on a Local Disk Connection
readTextFileByChunk Experimental sequential text reader helper function
rrDiv Random Replicate Division
%>% Pipe data
to_ddf Convert dplyr grouped_df to ddf
rhipeControl Specify Control Parameters for RHIPE Job
as.list.ddo Turn 'ddo' / 'ddf' Object into a list
convert Convert 'ddo' / 'ddf' Objects
combMean Mean Recombination
digestFileHash Digest File Hash Function
flatten "Flatten" a ddf Subset
drHexbin HexBin Aggregation for Distributed Data Frames

Details

Type Package
Date 2016-03-14
License BSD_3_clause + file LICENSE
URL http://tessera.io/docs-datadr
LazyLoad yes
LazyData yes
NeedsCompilation no
RoxygenNote 5.0.1
Additional_repositories http://ml.stat.purdue.edu/packages
Packaged 2016-03-14 18:58:23 UTC; hafen
Repository CRAN
Date/Publication 2016-03-14 23:55:14
