datadr (version 0.8.4)

drHexbin: HexBin Aggregation for Distributed Data Frames

Description

Creates a "hexbin" object of hexagonally binned data from a distributed data frame. The computation is division-agnostic: the result does not depend on how the data frame is split up.
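The division-agnostic property can be illustrated with the hexbin package alone. The sketch below is not taken from the datadr source and may differ from how drHexbin works internally. It assumes that every subset is binned on the same hexagon lattice (identical xbnds, ybnds, xbins, and shape), so that per-cell counts can simply be summed. The variable names (g, parts, merged, full, agrees) are illustrative only.

```r
library(hexbin)

set.seed(1)
x <- rnorm(1000)
y <- rnorm(1000)
g <- sample(letters[1:4], 1000, replace = TRUE)  # arbitrary division of the rows

# Fix the bounds up front so every subset uses the same hexagon lattice
xb <- range(x)
yb <- range(y)

# Bin each subset independently on the shared lattice
parts <- lapply(split(seq_along(x), g), function(i) {
  h <- hexbin(x[i], y[i], xbins = 30, shape = 1, xbnds = xb, ybnds = yb)
  data.frame(cell = h@cell, count = h@count)
})

# Combine the subsets by summing the counts for each cell ID
merged <- aggregate(count ~ cell, data = do.call(rbind, parts), FUN = sum)
merged <- merged[order(merged$cell), ]

# Bin the undivided data on the same lattice and compare
full <- hexbin(x, y, xbins = 30, shape = 1, xbnds = xb, ybnds = yb)
agrees <- identical(as.integer(merged$cell), as.integer(full@cell)) &&
  identical(as.integer(merged$count), as.integer(full@count))
```

Under these assumptions the merged cell counts should match those of the undivided binning whatever the grouping in g. This is also why the x and y ranges need to be known before the per-subset binning starts.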

Usage

drHexbin(data, xVar, yVar, by = NULL, xTransFn = identity,
  yTransFn = identity, xRange = NULL, yRange = NULL, xbins = 30,
  shape = 1, params = NULL, packages = NULL, control = NULL)

Arguments

data
a distributed data frame
xVar, yVar
names of the variables to use
by
an optional variable name or vector of variable names by which to group hexbin computations
xTransFn, yTransFn
transformation functions to apply to the x and y variables, respectively, prior to binning
xRange, yRange
ranges of the x and y variables (can be left as NULL if summaries have been computed)
xbins
the number of bins partitioning the range of the x variable
shape
the shape of the plotting regions, given as the ratio yheight/xwidth
params
a named list of objects external to the input data that are needed in the distributed computing (most should be taken care of automatically such that this is rarely necessary to specify)
packages
a vector of R package names that contain functions used in the computation, such as xTransFn and yTransFn (most should be taken care of automatically such that this is rarely necessary to specify)
control
parameters specifying how the backend should handle things (most likely parameters to rhwatch in RHIPE) - see rhipeControl and localDiskContr

Value

  • a "hexbin" object

References

Carr, D. B. et al. (1987) Scatterplot Matrix Techniques for Large N. JASA 82, 398, 424-436.

See Also

drQuantile

Examples

library(datadr)
library(hexbin)  # provides hexbin() for the comparison at the end

# create dummy data and divide it
dat <- data.frame(
  xx = rnorm(1000),
  yy = rnorm(1000),
  by = sample(letters, 1000, replace = TRUE))
d <- divide(dat, by = "by", update = TRUE)
# compute hexbins on divided object
dhex <- drHexbin(d, xVar = "xx", yVar = "yy")
# dhex is equivalent to running on undivided data:
hexbin(dat$xx, dat$yy)
