Note: a newer version (0.1.2) of this package is available.

ddR (version 0.1.1)

Distributed Data Structures in R

Description

Provides distributed data structures and simplifies distributed computing in R.

Install

install.packages('ddR')

Monthly Downloads

16

Version

0.1.1

License

GPL (>= 2) | file LICENSE

Maintainer

Edward Ma

Last Published

November 12th, 2015

Functions in ddR (0.1.1)

colSums,DObject-method

Gets the column sums for a distributed array or data.frame.
repartition

Repartitions a distributed object. This function takes two inputs, a distributed object and a skeleton. These inputs must both be distributed objects of the same type and same dimension. If 'dobj' and 'skeleton' have different internal partitioning, this function will return a new distributed object with the same internal data as in 'dobj' but with the partitioning scheme of 'skeleton'.
dframe

Creates a distributed data.frame with the specified partitioning and data.
dmapply

Distributed version of mapply. Similar to R's 'mapply', it allows a multivariate function, FUN, to be applied to several inputs. Unlike standard mapply, it always returns a distributed object.
dlapply

Distributed version of 'lapply'. Similar to dmapply, but permits only one iterable argument, and output.type is always 'dlist'.
get_parts

Gets the partitions of a distributed object, given an index.
is.dobject

Returns whether the input entity is a DObject.
cbind,DObject-method

Column binds the objects.
combine

Combines a list of partitions into a single distributed object. A frontend wrapper can implement this without actually combining the data in storage.
colnames,DObject-method

Gets the colnames for the distributed object.
init

Called when the backend driver is initialized.
as.dlist

Creates a distributed list from the input.
darray

Creates a distributed array with the specified partitioning and contents.
colMeans,DObject-method

Gets the column means for a distributed array or data.frame.
names<-,DObject-method

Sets the names of a distributed object.
do_dmapply

Backend-specific dmapply logic. This is a required override for all backends to implement so dmapply works.
dlist

Creates a distributed list with the specified partitioning and data.
dimnames<-,DObject,list-method

Sets the dimnames for the distributed object.
[

Extract parts of a distributed object.
do_collect

Backend-implemented function that moves data from storage to the calling context (node).
psize

Returns the size of each partition of the input distributed object.
rbind

rbindddR
rowMeans,DObject-method

Gets the row means for a distributed array or data.frame.
is.sparse_darray

Returns whether the input is a sparse_darray.
parts

Retrieves, as a list of independent objects, pointers to each individual partition of the input.
getBestOutputPartitioning

An overridable function that determines the output partitioning scheme of a dlapply or dmapply call. It chooses the 'ideal' nparts for the output when none is supplied. To keep behavior consistent with the API standard, overriding it is not recommended.
nparts

Returns a vector of length two giving the number of partitions along each dimension of the distributed object, in the form c(partitions_per_column, partitions_per_row). For a dlist, the value is equivalent to c(totalParts(dobj), 1).
rbind,DObject-method

Row binds the arguments.
collect

Fetch partition(s) of 'darray', 'dframe' or 'dlist' from remote workers.
is.dframe

Returns whether the input is a dframe.
totalParts

Returns the total number of partitions of the distributed object. The result is the same as prod(nparts(dobj)).
parallel

The default parallel driver.
shutdown

Called when the backend driver is shut down.
dimnames,DObject-method

Gets the dimnames for the distributed object.
ddRDriver-class

The base S4 class for backend driver classes to extend.
[[,DObject,numeric-method

Extracts a single element of a distributed object.
getPartitionIdsAndOffsets

Gets the internal set of partitions, and the offsets within each partition, for a set of 1d or 2d subset indices of a distributed object.
rownames,DObject-method

Gets the rownames for the distributed object.
useBackend

Sets the active backend driver. Functions exported by the 'ddR' package are dispatched to the backend driver. Backend-specific initialization parameters may be passed into the ellipsis (...) part of the function arguments.
$,DObject-method

Extracts elements of a distributed object matching the name.
DObject-class

The baseline distributed object class to be extended by each backend driver. Backends may elect to extend it once for all distributed object types ('dlist', 'darray', 'dframe', etc.) or once per type, depending on their needs.
is.darray

Returns whether the input is a darray.
as.dframe

Converts the input matrix or data.frame into a distributed data.frame.
mean,DObject-method

Gets the mean value of the elements within the object.
sum,DObject-method

Gets the sum of the objects.
ddR

Distributed Data-structures in R
rowSums,DObject-method

Gets the row sums for a distributed array or data.frame.
as.darray

Converts the input matrix into a distributed array.
is.dlist

Returns whether the input is a dlist.
dds

Distributed Data-structures in R
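
Example

The entries for dmapply, collect, is.dlist and totalParts above can be illustrated together with a minimal sketch. It assumes ddR is installed and that the default 'parallel' backend is used without an explicit useBackend call. The input vectors, the anonymous function and the nparts value are illustrative choices, not taken from the package documentation.

```r
library(ddR)

# Apply a two-argument function element-wise over two inputs.
# Per the dmapply entry, the result is always a distributed object;
# nparts = 2 (an illustrative choice) requests two partitions.
dl <- dmapply(function(x, y) x^2 + y, 1:4, 5:8, nparts = 2)

# Check the type of the result and how many partitions it has.
is_dl   <- is.dlist(dl)
n_parts <- totalParts(dl)

# Fetch the partitions from the workers into an ordinary R list.
result <- collect(dl)
print(unlist(result))
```

Until collect is called, the data stays in the backend's storage; only collect moves it into the calling R session.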